PDFedit pdf manipulation library, gui, tools

PDFedit design documentation

Table of Contents

I. Document purpose
II. PDFEdit general overview
1. Used technologies
Used Libraries
Boost libraries
STL - standard template library
Qt framework
CPP Unit automatic testing
Test class description
Test program
Code and script documentation
Design and user documentation
2. PDFedit layers
Kernel layer
Scripting layer
Gui layer
3. General utilities used in PDFedit
Configuration parser
Rules manager
4. xpdf project in PDFedit
Changes needed for code reuse
Object.h and Object.cc
XRef.h and XRef.cc
Array.h and Array.cc
Dict.h and Dict.cc
Stream.h and Stream.cc
Parser.h and Stream.cc
III. Kernel design description
5. Interface objects
Low level CObjects
General description
Changing objects
Instance life cycle
CPdf modes
Properties changing and revision handling
Provided high level objects
Observing cpage
Changing cpage
Changing page contents
Displaying page
Objects on a page
Instance creation
Annotation initializator
Observing ccontentstream
Changing ccontentstream
Storing pdf operators
Changing pdf operators
Changing pdf operators
Pdfoperator iterator
Extending iterator
6. Kernel internal objects
Stream encoding
Xpdf filters
Encoding filter design
Observers in cobjects
Observers in CPdf
Page tree synchronization
Observers in CPage
Observer for annotation
Graphical state
Obtaining information after every operator
7. Changes to pdf document
3 layer model
The lowest layer
The middle layer
The highest layer
Document saving
Revision saving
Content writing and IPdfWriter
Document cloning
Linearized pdf documents
IV. GUI design description
8. PDF Editor menu and toolbar system
Menus and Toolbars
Lists and items
Special toolbar items
Icon themes
9. Settings system
Static (system) Settings
File format for static settings
Dynamic (user) Settings
File format for user settings
10. Object tree view
MultiTreeWindow class
TreeWindow class
TreeItemAbstract class
TreeItemAbstract subclasses

Document purpose

This document describes the design and internals of PDFedit, a program for manipulating PDF documents. It does not give a precise description of the code or classes; instead it presents the ideas and general information needed to understand the implementation. Anyone who wants to use or reuse this project, or to understand its current state, should start with this document and then continue with the automatically generated doxygen programming documentation.

Document organization

The document is divided into several parts:

PDFEdit general overview

This part gives general information about the project internals: which technologies were used during design and implementation, and which helper (utility) classes were implemented to support particular tasks. Finally, it describes the reuse of Xpdf code and the modifications necessary to enable it.

Chapter 1. Used technologies

Our project uses several technologies. All of them are open, standardized, generally accepted and free. Their licence policy is compatible with the GPL (GNU General Public License).

Used Libraries

Boost libraries

Boost is a free, highly portable and de facto standard set of libraries for the C++ language (see Boost home). Most new C++ features that are very likely to become part of the standard are first implemented and tested here. The technical report (TR1) is also implemented in Boost.

We mainly use smart pointers, especially shared pointers, which provide easy-to-use and safe automatic life cycle maintenance for shared objects. All objects exported from the kernel to higher layers are wrapped in shared_ptr smart pointers.
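The shared ownership described here can be illustrated with a minimal sketch. It uses std::shared_ptr as a stand-in for boost::shared_ptr so that it needs no external library; ExportedObject, makeExported and ownersWhileShared are hypothetical names and not part of the PDFedit kernel.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an object exported by the kernel.
struct ExportedObject {
    int value;
    explicit ExportedObject(int v) : value(v) {}
};

// Factory in the style described above: callers only ever see a shared pointer.
std::shared_ptr<ExportedObject> makeExported(int v) {
    return std::make_shared<ExportedObject>(v);
}

// Returns the owner count observed while a second owner is alive.
long ownersWhileShared(const std::shared_ptr<ExportedObject>& first) {
    std::shared_ptr<ExportedObject> second = first; // shares ownership
    second->value += 1;                             // both owners see the change
    return first.use_count();
}   // 'second' goes out of scope here; the object stays alive through 'first'
```

The object is destroyed only when the last shared pointer to it is gone, so no layer has to decide who deletes it.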

Boost Iostreams make it easy to create standard C++ streams and stream buffers and provide a framework for defining Filters and attaching them to standard streams and stream buffers. The second feature allows creating a flexible, easy to use and extend solution to support encryption/decryption and compression/decompression of objects.

STL - standard template library

STL is the standard C++ set of libraries that provides container, algorithm, iterator (and many more) template classes. Its implementations are highly portable and optimized for high performance. We mainly use the containers for data storage and iterators for efficient traversal of data structures. For more information, see the STL documentation.

Qt framework

Qt is a multiplatform C++ GUI toolkit created and maintained by Trolltech; we use version 3 of the toolkit. We mainly use the GUI (widget) classes (see Qt classes) and the QSA framework for the scripting layer. A slightly modified version of QSA, based on QSA 1.1.4, is included in our source tree.

CPP Unit automatic testing

CPP Unit is a C++ unit testing framework. We use it for automatic testing of the kernel interface and its functionality. All test cases are placed in the kernel/tests directory and are linked into the kernel/kernel_tests binary. We have implemented test classes for all interface objects. Each test class is specialized for one interface class. Each class has the general name form

	TestClassName

where ClassName stands for the tested class. A test class implements a test suite, which is identified by its name. The main test program runs all test suites specified by name (see the section called “Test program”). Each test suite consists of test cases, which test a particular behavior of the tested class.

Test class description

Each test class should inherit from the CPP Unit TestFixture class. First, the CPPUNIT_TEST_SUITE and CPPUNIT_TEST macros should be used to prepare the class for the CPP Unit framework and to define the test case functions. Finally, the class should be registered with the framework so that the test program can run it by the specified name (this name should follow the TEST_CLASSNAME convention).

Each test case should perform operations on the tested class and check the invariants that have to hold for those operations. The CPPUNIT_ASSERT macro should be used to check an invariant condition, and CPPUNIT_FAIL should be used to force the test to fail.

See the following example (it can be used as a template for creating a new test suite).

	#include <cppunit/extensions/HelperMacros.h>

	class TestClassName : public CppUnit::TestFixture
	{
		// defines this class as test suite
		CPPUNIT_TEST_SUITE(TestClassName);
			// This method has to be implemented as test case
			CPPUNIT_TEST(Test);
			// definition of other test cases
		CPPUNIT_TEST_SUITE_END();

	public:
		void setUp()
		{
			// this method should initialize local test class
			// data used in tests
		}

		void tearDown()
		{
			// clean up after setUp method
		}

		void Test()
		{
			// Implementation of testcase

			// ...

			// sometimes we need to test that something throws an
			// exception
			try
			{
				// this operation should throw with these parameters
				// (placeholder: call the tested operation here)

				// exception hasn't occurred, we will force failure
				CPPUNIT_FAIL("This operation should have failed.");
			}catch(ExceptionType & e)
			{
				// ok, exception has been thrown
				// (ExceptionType is a placeholder for the expected type)
			}
		}
	};

	// registers this class to CPP Unit framework and assigns it
	// with given name
	CPPUNIT_TEST_SUITE_NAMED_REGISTRATION(TestClassName, "TEST_CLASSNAME");

Test program

There are two sets of input parameters you can specify. The first set specifies input pdf files or directories. If no file with the specified name is found, the parameter is assumed to be the name of a test suite to run. The result of each test is information on whether it succeeded, threw an exception, or failed a condition.


Code and script documentation

We use the Doxygen documentation tool. Documented parts therefore use a special comment format in which the comment starts with a double star, like: /** Doxygen comment */. Doxygen then uses these comments to create html pages (or other formats).

Functions exported to scripting use a different kind of documentation in addition to doxygen comments. We had to use a format different from doxygen's; otherwise doxygen would parse our comments and we would parse doxygen comments, which would lead to confusion. So, besides the ordinary doxygen comment located above the function body in the .cc file, we write an extra comment in the corresponding .h file, which is exported to the scripting API documentation. The content of the two comments is often similar, but the doxygen documentation for programmers often contains information that is useless or misleading for script users, and vice versa.

Documentation for scripts is placed in a comment whose first character is a dash (rather than the second star of doxygen comments), like this: /*- Comment for function */. The comment has to be put in a header file directly above the function declaration. A comment for a class is similar, but it has an equals sign instead of the dash, like: /*= Comment for class */. It has to be put directly above the class declaration.

Another difference between doxygen and this format is that the comment is treated as docbook code: it can contain docbook tags to format the comment or to insert a list or table, basically whatever docbook code can be put between <para> and </para> tags.
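As an illustration of the three comment kinds, the following sketch places them around a small class. QSExample and greeting are hypothetical names chosen for the example; a real wrapper would derive from QSCObject and export its functions as slots, which is left out here to keep the sketch free of Qt.

```cpp
#include <cassert>
#include <string>

/*= Hypothetical wrapper class, shown only to illustrate the comment format. */
class QSExample {
public:
    /*-
     Return the greeting text.
     Docbook tags such as <emphasis>this one</emphasis> may appear here.
    */
    std::string greeting() const;
};

/** Doxygen comment for programmers; in the project layout described above
    this comment and the function body would live in the .cc file. */
std::string QSExample::greeting() const {
    return "hello";
}
```

Only the /*= and /*- comments would end up in the scripting API documentation; the /** comment is seen by doxygen alone.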

Among the tools for generating documentation there is a perl script that parses a docbook file and replaces every string of the form <!--TYPE: filename.h --> with a generated chunk of documentation from filename.h, documenting the class in it. This is used for wrapper classes, where class methods correspond to the same methods in the scripting environment. Similarly, every string of the form <!--BASETYPE: filename.h --> is replaced with a similarly generated documentation chunk, except that the functions are assumed to be static functions available to scripts, not methods of the class contained in the header file. This is used to document the base classes (Base, BaseGui and BaseConsole), because slot functions in these classes are exported to scripting as static functions.

Documentation generated this way is then treated as an ordinary docbook XML file. The scripting API is documented in the User documentation Appendix.

Design and user documentation

The design documentation and the user manual are written in Docbook, a standardized, open and free format. The XML files (with the OASIS docbook 4.2 DTD) that form this design documentation are stored in the doc/design directory. The main file is design_doc.xml, which includes all other files. Files specific to the kernel design are stored in the doc/design/kernel directory. Following the same logic, gui design files are stored in the doc/design/gui directory.

Chapter 2. PDFedit layers

The PDFEdit project is based on a 3-layer model:

PDFedit layers

  • Kernel - responsible for maintaining pdf document content and providing an interface for changes.
  • Script - responsible for wrapping the kernel interface and exporting it to the user or gui.
  • Gui - visualizes the functionality accessible directly from the kernel or Script layers and makes it comfortable to use.

The kernel layer is built on top of the popular open source Xpdf project (see Chapter 4, xpdf project in PDFedit). It reuses xpdf code for low-level pdf document access (reading and decoding content and parsing it into objects) as well as for display functionality. Xpdf objects are transformed into internal pdfedit interface objects, which provide additional logic, and as such they are exported to higher layers.

The script layer is based on QSA (Qt Script for Applications), a scripting language based on ECMAScript, developed by Trolltech.

The gui layer uses the Qt framework, also developed by Trolltech. Most gui parts are based on scripts, which means that the user interface is very flexible and can be changed without recompiling code; most changes can even be made at runtime.

This chapter describes the PDFedit layers, their communication interface and responsibilities.

Kernel layer

The kernel, as the lowest layer, is responsible for maintaining pdf content from a file and for providing an object interface for making changes to higher layers. We call these objects cobjects. More precisely, there are highlevel cobjects (CPdf, CPage, etc.), which provide the logic of higher pdf entities, and lowlevel cobjects, which carry pdf data types (CInt, CArray, CDict, CString, etc.). Values stored in lowlevel cobjects are also called properties and they are wrapped by the IProperty class. Properties are identified by an indirect reference (the way pdf addresses entities). A user of the kernel should start with a CPdf instance, which provides all properties from the document as well as access to document pages and outlines. Pages then provide access to annotations. All cobjects are returned wrapped in a shared_ptr (see Chapter 1, Used technologies).

The kernel uses Xpdf code for parsing document content. Xpdf's XRef class provides fetching and parsing functionality. The opposite direction (from cobjects to writing a file) is provided by an IPdfWriter implementation. The Xpdf Stream class is replaced by the StreamWriter kernel class.

The CXRef class inherits from XRef (an xpdf class) and adds internals for storing changed objects; these are not public to direct users. XRefWriter exposes the interface for making changes that it inherits from CXRef. See the section called “3 layer model” for more information.

Kernel architecture

Scripting layer

Scripting architecture

Scripting is the base of the editor functionality. Each editor window has its own script context, and scripts run independently in each of them. When a window is created, its scripting base is constructed (BaseGUI, which extends Base with GUI specific functions). The Base will construct some necessary objects:

  • QSProject - QSA class for scripting project

  • QSInterpreter - script interpreter from QSProject

  • QSImporter - Helper class used for adding objects to and removing them from the scripting environment at runtime

  • QSUtilFactory - Standard QSA utility factory; provides the File, Dir and Process objects, which allow scripts to manipulate files and directories (reading, writing, creating, ...) and processes (running external commands)

  • QSInputDialogFactory - Standard QSA input dialog factory, allowing scripts to create simple dialogs for requesting user input

The window will set a ConsoleWriter object, which will handle script output. There are two classes derived from ConsoleWriter:

  • ConsoleWriterGui, used in GUI mode, which transfers the output to command window

  • ConsoleWriterConsole, used in console mode, which simply writes the output to STDOUT.

Class Base (or, respectively, BaseGui or BaseConsole) exports many functions as slots. These are visible as static functions in the script and are the main way of communication between scripts and the editor. The BaseCore class, from which the Base class is derived, does not provide any static functions, but it provides basic script functionality: garbage collection, support for callback functions, running init scripts and handling of script errors.

Wrappers.  Scripts need to manipulate objects in the PDF and in the editor. Due to the limitations of QSA, every C++ object (except some basic types, such as strings and numbers, and some Qt types, such as QColor) needs to be derived from QObject to be usable in scripting, and only functions exported as slots are available to scripts. Because of this limitation, wrappers need to exist around most objects, such as tree items in the object tree (QSTreeItem, QSTreeItemContentStream), PDF objects (QSAnnotation, QSArray, QSContentStream, QSDict, QSIProperty, QSPage, QSPdf, QSPdfOperator and QSStream), the class for invoking popup menus (QSMenu) and helper classes related to PDF objects (QSIPropertyarray, QSPdfOperatorIterator and QSPdfOperatorStack). All wrappers are derived from the QSCObject class, which provides some basic functions for memory handling and error handling.

Source of script input. 

  • One source of script input is the init scripts - they are run on application startup.

  • Another source of script input is menus and toolbars. Each menu or toolbar item has some associated script code, which is run when the item is activated. The user can see the commands invoked by these scripts in the console window.

  • A third source of script input is the preview window, as interaction with it can result in script functions being called, depending on the mode of the window (different functions will be called in "add new line" and "add new text" mode, for example).

  • Callbacks are another source of script input. There are some special toolbar items which manipulate either their internal state or the edited document itself when the user interacts with them (items to switch revision, select the current color, edit text, or show and edit the current page number). These items use callbacks to notify the script of their action, so that the script may react (for example, to another color being selected or to the text in the text edit toolbar item being changed).

  • Finally, the user can use the commandline and type in any script code they want to execute.

Scripting API documentation.  Description of static scripting functions, functions provided by settings and PageSpace objects and description of scripting objects and their methods is included in the user documentation.

Console mode.  Functionality in console mode is similar, with a few exceptions:

  1. BaseConsole is used instead of BaseGUI. This class extends the Base class with a few console-specific functions.

  2. ConsoleWindow is used instead of PDFEditWindow. This class provides some of the functionality for running scripts on the console (handling its input and output), similarly to PDFEditWindow.

  3. PageSpace object is unavailable.

  4. There is no interactivity. The editor will run the scripts specified on the commandline and then it will exit.

Gui layer

The basic class in the GUI is PdfEditorWindow, which represents the main editor window. The application can have several such windows open, each of them editing a different document. At the top of the window are the menubar and toolbar (although the top is only the default position; the user can move all toolbars freely, and each toolbar will either dock on any of the four sides of the editor window or float outside of it).

All toolbars are of the ToolBar class, derived from the Qt class QToolBar. The menubar is a standard Qt QMenuBar, although filling the menubar and the toolbars with items, maintaining them and maintaining their association with script code is the responsibility of class Menu. At the bottom there is a statusbar, which can be used to show various information (class StatusBar, derived from the Qt class QStatusBar). The rest of the area, not occupied by the statusbar, menu and toolbars, is divided by a movable splitter into left and right parts. The left part is divided by another splitter into the preview window (class PageSpace) on top and the commandline window, which provides script input and output (class CommandWindow), at the bottom. The right part is also divided by a splitter: the upper part holds the object tree view (class MultiTreeWindow) and the lower part holds the property editor (class PropertyEditor).

Every element mentioned above (except the menu and the preview window) can be hidden and shown by the user. The application remembers the element layout (size and position of the window, position of splitters and position of toolbars) in its settings when closing and reopens next time with the same layout.

Dialogs are also part of the GUI. Many simple dialogs are handled by script, but as script is unable to create more complex dialogs, some of them had to be implemented directly in C++. They are:

  • AboutWindow - window showing version of editor and information about program and its authors

  • AddItemDialog - Dialog invoked when adding new properties to dictionary (CDict) or new elements to array (CArray) in the tree view.

  • AnnotDialog - Dialog invoked when creating new annotation to fill in its data

  • HelpWindow - Dialog invoked for displaying help. Basically a very simple HTML browser

  • MergeDialog - Dialog invoked when the function "Import pages from another document" is invoked. The dialog allows the user to select pages to import from another document and to specify the positions at which they should be imported

  • OptionWindow - Dialog for editing the user preferences interactively. The options are organized into tabs, each tab containing elements derived from the Option class, which maps one value in the settings to a widget for editing it. Subclasses of Option are:

    • BoolOption - editing with checkbox as boolean value

    • ComboOption - editing with combobox, allowing to select from list of predefined values

    • DialogOption - generic class for editable string, with "..." button allowing to invoke dialog to edit the option in some alternative and possibly more comfortable way

    • FileOption, derived from DialogOption - editing with possibility to invoke dialog to pick a filename

    • FontOption, derived from DialogOption - editing with possibility to invoke dialog to choose the font and its parameters interactively

    • StringOption - editing with classical one line edit box

    • IntOption, derived from StringOption - allowed input is limited to integer numbers

    • RealOption, derived from StringOption - allowed input is limited to real numbers

    When the user presses "Ok" or "Apply" button, each of the option editing widgets is asked to save its state in the corresponding option.

  • RefPropertyDialog - Dialog to interactively select target for reference while adding or editing it.

Also, some standard system dialogs (to pick font, color or name of file) are used.

Chapter 3. General utilities used in PDFedit

This chapter deals with utility classes that were implemented for pdfedit but can also be reused elsewhere (the implementation tends to be as independent of pdfedit as possible). They are stored (with some exceptions) in the src/utils directory and, when compiled, they are collected in one static library, libutils.a.


Delinearizator is a class that provides a simple interface for pdf document delinearization (see also Linearized pdf document). Class instances are created by the factory method getInstance (see Factory method design pattern), and one instance handles one pdf file, which has to be linearized. If the file is not linearized, no instance is created and an exception is thrown. Once an instance is created, the document can be delinearized simply by calling the delinearize method.

Like XRefWriter, it uses an IPdfWriter implementator for content writing (this can be changed at runtime and provides flexibility in the output format).

Delinearizator itself is built on top of the Xpdf XRef class, which provides object fetching functionality and cross reference table maintenance. This is used for fetching all objects (except those specific to linearized content), and the IPdfWriter implementation is used to write them to the new file.

This class (implemented in src/utils/delinearizator.cc) depends on xpdf code and kernel/pdfwriter module.

Delinearizator architecture

Configuration parser

IConfigurationParser provides an interface for parsing an underlying stream whose data can be transformed (depending on the format and the parser implementation) into key, value pairs, where the key is a data identifier and the value is associated with this key.

The class is an abstract template, which means that implementators have to implement all methods and supply data types for key and value. IConfigurationParser is defined in the src/utils/confparser.h file. We provide a simple implementation in the StringConfigurationParser class, which parses files with a simple format:

	# comments are ignored
	% this also stands for a comment by default
	key : value # this key is associated with value

Where both key and value are strings. This parser can be configured to ignore comments (strings starting with a character from commentsSet), to use a different delimiter character (the one which separates key from value) by setting delimiterSet, or to set the characters which should be considered blank (by setting blankSet).
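The format above can be handled in a few lines. The following is a minimal sketch, not the StringConfigurationParser implementation: ParserSets, trim and parseLine are hypothetical names, and only the three set names and the default comment characters are taken from the description.

```cpp
#include <cassert>
#include <string>

// Hypothetical settings holder mirroring the three configurable sets.
struct ParserSets {
    std::string commentsSet = "#%";   // characters that start a comment
    std::string delimiterSet = ":";   // characters separating key and value
    std::string blankSet = " \t";     // characters trimmed from both ends
};

// Remove leading and trailing blank characters.
std::string trim(const std::string& text, const std::string& blanks) {
    const std::string::size_type first = text.find_first_not_of(blanks);
    if (first == std::string::npos) return "";
    const std::string::size_type last = text.find_last_not_of(blanks);
    return text.substr(first, last - first + 1);
}

// Returns true and fills key/value when the line holds a pair; returns
// false for empty lines, pure comments and lines without a delimiter.
bool parseLine(const std::string& line, const ParserSets& sets,
               std::string& key, std::string& value) {
    // Everything from the first comment character on is dropped.
    const std::string content =
        line.substr(0, line.find_first_of(sets.commentsSet));
    const std::string::size_type delim =
        content.find_first_of(sets.delimiterSet);
    if (delim == std::string::npos) return false;
    key = trim(content.substr(0, delim), sets.blankSet);
    value = trim(content.substr(delim + 1), sets.blankSet);
    return !key.empty();
}
```

Changing delimiterSet to "=" would let the same function read key = value files, which is the kind of flexibility the three sets are meant to give.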

Configuration parser code doesn't depend on pdfedit or xpdf code and can be reused as it is without any changes. It uses STL streams. In PDFedit project it is used e. g. in ModeController or OperatorHinter classes.

Configuration parser object structure.

Rules manager

RulesManager is a simple concept based on the association of a rule with its target. The implementation uses the C++ template mechanism to be generic in the data types of both rule and target. Both types have to fulfill certain contracts (see the doxygen program documentation for more details). Rules are keys in the internal storage and they are associated with their targets (1:1 relation).

The described storage, together with its data, forms the RulesManager configuration. The second part is based on IRuleMatcher (an implementator of this abstract class). It is responsible for the logic of choosing rules, evaluating the compatibility of rules and defining rule priorities. When the findMatching method is called, the matcher is consulted to choose the association from the storage that matches the given rule best. The implementator of the matcher has to implement class Functor so that it describes when a given rule matches a given original rule and also provides the priority of this match.

The class user just defines the rule and target data types, implements IRuleMatcher for the rule data type with the matching logic, and uses the class as it is. RulesManager also enables loading the rule, value configuration from a configuration file. loadFromFile uses a Parser template data type; an IConfigurationParser implementator with the rule type for key and the target type for value can be used here.
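The interplay of storage and matcher can be sketched as follows. This is a simplified illustration under stated assumptions, not the RulesManager or IRuleMatcher interface: SimpleRulesManager, PrefixMatcher and addRule are hypothetical names, and the prefix-based matching rule is invented for the example. Only the findMatching name comes from the description above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical matcher: reports whether a stored rule matches the queried
// rule and with which priority.
struct PrefixMatcher {
    // A stored rule matches when it is a prefix of the query; longer
    // prefixes get higher priority. An empty stored rule matches anything.
    bool operator()(const std::string& stored, const std::string& query,
                    int& priority) const {
        if (query.compare(0, stored.size(), stored) != 0) return false;
        priority = static_cast<int>(stored.size());
        return true;
    }
};

// Simplified rules manager: rules are keys, each mapped to one target.
template <typename Rule, typename Target, typename Matcher>
class SimpleRulesManager {
public:
    void addRule(const Rule& rule, const Target& target) {
        rules_[rule] = target;
    }

    // Picks the target of the best matching stored rule, if any.
    bool findMatching(const Rule& query, Target& result) const {
        bool found = false;
        int best = 0;
        typename std::map<Rule, Target>::const_iterator it;
        for (it = rules_.begin(); it != rules_.end(); ++it) {
            int priority = 0;
            if (matcher_(it->first, query, priority) &&
                (!found || priority > best)) {
                found = true;
                best = priority;
                result = it->second;
            }
        }
        return found;
    }

private:
    std::map<Rule, Target> rules_;
    Matcher matcher_;
};
```

With the rules "", "T" and "Tj" registered, a query for "Tj" selects the target of "Tj", a query for "Tm" falls back to "T", and any other query falls back to the empty rule.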

This concept was used for the ModeController and OperatorHinter classes in our project (see the section called “ModeController”).

Rules manager object structure


The IObserver class, which stands for an observer in the following context, is a mechanism that allows an object to announce changes of its internal state to other objects. The object whose internal state changes and which announces the change is the observer handler, and the classes which monitor (observe) it are called observers. This is the basic idea of the Observer design pattern. This implementation keeps the basic contracts of the pattern and adds functionality to be as flexible as possible.

From the class point of view, an observer handler is a class which inherits from the ObserverHandler class. This template class provides an interface for observer registration and unregistration. Each observer has to be registered before it is notified about changes. When it is no longer interested in changes, it should unregister itself from the observed objects. ObserverHandler also provides a method which announces a change to all registered observers (the notifyObservers method). Observers are called in an order which depends on their priorities and, for equal priorities, on registration order. The observer handler is responsible for calling notifyObservers whenever its internal state has changed (and it wants to announce this change) and for providing correct parameters for it.

An observer has to implement the IObserver abstract class. The most important method to implement is notify. This method is called by the observer handler after its state has changed. The observer can use the given newValue parameter, which holds the current value of what has changed. Additional information can be obtained from the given context (see below). The notify method can use the given values to update its internal state or to perform additional actions, but in any case it shouldn't modify the given data (this can lead to an endless loop, because the observer handler announces the change, so the observer is notified again, and this never stops).
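The registration and notification order can be sketched as below. This is a reduced illustration, not the ObserverHandler interface: the class names are hypothetical, the change context is omitted, std::shared_ptr stands in for the boost smart pointer, and the convention that a lower priority value is called earlier is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Simplified observer interface (no change context in this sketch).
template <typename T>
class SimpleObserver {
public:
    virtual ~SimpleObserver() {}
    virtual void notify(const T& newValue) = 0;
    virtual int priority() const = 0;   // assumption: lower value = earlier
};

template <typename T>
class SimpleObserverHandler {
public:
    typedef std::shared_ptr<SimpleObserver<T> > ObserverPtr;

    void registerObserver(const ObserverPtr& observer) {
        // Insert behind all observers with the same or a lower priority
        // value, so that equal priorities keep registration order.
        typename std::vector<ObserverPtr>::iterator pos = observers_.begin();
        while (pos != observers_.end() &&
               (*pos)->priority() <= observer->priority())
            ++pos;
        observers_.insert(pos, observer);
    }

    void unregisterObserver(const ObserverPtr& observer) {
        observers_.erase(
            std::remove(observers_.begin(), observers_.end(), observer),
            observers_.end());
    }

    void notifyObservers(const T& newValue) const {
        for (std::size_t i = 0; i < observers_.size(); ++i)
            observers_[i]->notify(newValue);
    }

private:
    std::vector<ObserverPtr> observers_;
};

// Example observer that appends its tag to a shared log when notified.
class LoggingObserver : public SimpleObserver<int> {
public:
    LoggingObserver(const std::string& tag, int prio, std::string& log)
        : tag_(tag), prio_(prio), log_(log) {}
    void notify(const int&) { log_ += tag_; }
    int priority() const { return prio_; }
private:
    std::string tag_;
    int prio_;
    std::string& log_;
};
```

Note that the observer only records the notification and never changes the observed value, in line with the rule that notify must not modify the data it is given.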

Observer and observer handler cooperation

Notification would be rather poor if only newValue was available, so our observer concept adds a context to the notify method. This context carries additional information. The context hierarchy is based on the IChangeContext abstract class, which provides just information about the type of the context. A notify implementator should check this type, cast the given context to the correct type and use it as needed. If the observer handler doesn't want to give any context, the context should be NULL.

We have implemented following types of contexts because they were needed by project.

  • BasicChangeContext - additionally gives previous value (originalValue) of newValue
  • ComplexChangeContext - inherits from BasicChangeContext and additionally gives information about value id. This can be used for complex objects or containers where value is stored inside and identifier is value name (in complex object scope) or id (position or key) from container. We have defined following conventions - if value was added, then originalValue is NULL (or something that represents NULL) and when value is removed, newValue is NULL.
  • ScopedChangeContext - inherits from IChangeContext and adds second template parameter for scope data type. Scope is abstraction for area where current newValue has changed. We are using this context for the section called “ProgressObserver”.

New context types can also be defined very easily: just inherit from IChangeContext, implement getType to return the correct type enum value, and add the information specific to the context to the class.
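The check-then-cast step that a notify implementation performs can be sketched as follows. ChangeContext, BasicContext and changeDelta are hypothetical, simplified names modelled on the description; they are not the IChangeContext or BasicChangeContext declarations, and std::shared_ptr again stands in for the boost smart pointer.

```cpp
#include <cassert>
#include <memory>

// Simplified context base: only reports its type.
template <typename T>
class ChangeContext {
public:
    enum Type { BasicContextType, CustomContextType };
    virtual ~ChangeContext() {}
    virtual Type getType() const = 0;
};

// Context that additionally carries the previous value.
template <typename T>
class BasicContext : public ChangeContext<T> {
public:
    explicit BasicContext(const std::shared_ptr<T>& original)
        : original_(original) {}
    typename ChangeContext<T>::Type getType() const {
        return ChangeContext<T>::BasicContextType;
    }
    std::shared_ptr<T> getOriginalValue() const { return original_; }
private:
    std::shared_ptr<T> original_;
};

// What a notify implementation might do: check the context type, cast, and
// read the previous value. Returns newValue - originalValue, or newValue
// alone when no usable context (or no original value) is given.
int changeDelta(const std::shared_ptr<int>& newValue,
                const std::shared_ptr<const ChangeContext<int> >& context) {
    if (!context ||
        context->getType() != ChangeContext<int>::BasicContextType)
        return *newValue;
    std::shared_ptr<const BasicContext<int> > basic =
        std::static_pointer_cast<const BasicContext<int> >(context);
    std::shared_ptr<int> original = basic->getOriginalValue();
    return original ? *newValue - *original : *newValue;
}
```

The null check on the original value follows the convention that an added value has no original, so the observer must be prepared for an empty pointer there.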

Observer concept classes

Note that all the mentioned classes are C++ template classes and they take a data type as a template parameter. This is the data type of the value whose change is announced (e. g. newValue in the notify method). All instances are wrapped in shared_ptr (boost smart pointer) to prevent data life cycle problems, because shared_ptr correctly shares data between multiple users and deallocates them at the moment when nobody holds a shared pointer to them. If shared pointers are used correctly (meaning that the wrapped object is never deallocated by hand), no problems should occur with object instances. This is very important because the observer handler has no information on whether the registered observers are still alive at the moment it calls their notify method.

A second restriction for implementators and users is that the notify method, as well as all other methods (including constructors and the destructor), can't throw an exception. This is intentional, because the observer handler has to guarantee that each observer has been called by the time notifyObservers finishes. It doesn't know anything about the observers, so it also doesn't know how to handle their exceptions. The only possible reaction (to keep the contract that all observers are notified) would be to silently ignore (or log) the exception. This could lead to inconsistencies, so it is safer to forbid exceptions altogether. The notify method implementator has to keep this in mind because, if an exception is thrown where it is forbidden by the method signature (throw() clause after the method declaration), the program is forced to terminate by default.


Iterator is a specific implementation of the Iterator design pattern. It is used to traverse an arbitrary linked list that meets a few requirements. The main goal of this iterator implementation is to be flexible and easily extensible, because we need many special iterators that iterate only over specific items.

The iterator is bidirectional. Information about the previous and next item is obtained from the item itself. Keeping a container outside the stored items would be more flexible, but sometimes this is not possible and the information must be stored in the items themselves. The item before the first item and the item after the last item are not valid objects.

A new special iterator can easily be created from the base iterator just by inheriting from it and overriding the one function which selects valid items. Examples of special iterators can be found in the section called “Pdfoperator iterator”.
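A minimal sketch of this idea follows. Item, BaseIterator, TextIterator, validItem, rewind and collect are hypothetical names, not the kernel's iterator interface; the rule "name starts with T" is invented purely to have something to filter on.

```cpp
#include <cassert>
#include <string>

// Hypothetical list item that stores its own neighbours.
struct Item {
    std::string name;
    Item* prev;
    Item* next;
    explicit Item(const std::string& n) : name(n), prev(0), next(0) {}
};

// Base bidirectional iterator: accepts every item.
class BaseIterator {
public:
    explicit BaseIterator(Item* start) : current_(start) {}
    virtual ~BaseIterator() {}

    // False once the iterator has moved before the first or after the last item.
    bool valid() const { return current_ != 0; }
    Item* current() const { return current_; }

    // Move to the first accepted item at or after the start item. Kept out
    // of the constructor because virtual dispatch is not active there.
    void rewind() {
        while (current_ && !validItem(*current_)) current_ = current_->next;
    }
    // next() and prev() must only be called while valid() is true.
    void next() {
        do { current_ = current_->next; }
        while (current_ && !validItem(*current_));
    }
    void prev() {
        do { current_ = current_->prev; }
        while (current_ && !validItem(*current_));
    }

protected:
    // The single function a special iterator overrides.
    virtual bool validItem(const Item&) const { return true; }

private:
    Item* current_;
};

// Special iterator visiting only items whose name starts with 'T'.
class TextIterator : public BaseIterator {
public:
    explicit TextIterator(Item* start) : BaseIterator(start) {}
protected:
    bool validItem(const Item& item) const {
        return !item.name.empty() && item.name[0] == 'T';
    }
};

// Helper for trying it out: concatenates the names an iterator visits.
std::string collect(BaseIterator& it) {
    std::string out;
    for (it.rewind(); it.valid(); it.next()) out += it.current()->name + " ";
    return out;
}
```

All traversal logic lives in the base class, so each new special iterator costs one short predicate.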


Chapter 4. xpdf project in PDFedit

The Pdfedit project uses Xpdf code for low level pdf content operations, such as pdf object parsing (which matches the Adobe pdf specification ver. 1.6 quite well), indirect object resolving, generation of page and text output devices, stream decoding and so on. We have tried to reuse as much as possible of the functionality that is not directly related to the special logic of the xpdf application.

To prevent errors coming from xpdf code, and to make the project as a whole less dependent on xpdf, xpdf is used in just a few places (namely the CXref, XRefWriter, CObject, CPdf and CPage classes); the rest of our pdfedit code uses only our own classes and special objects. This means that xpdf code could be replaced by something different with changes in specific places, and the rest of the code would not know about it. The project currently uses xpdf version 3.01.

Changes needed for code reuse

The code of the Xpdf project could not be reused without modifications, because it is not well prepared for making changes to its objects (the code assumes that data are read and used, not changed). Our code changes can be divided into 3 categories:

  • Syntactic - these are changes related to function/method signatures (const modifiers, private methods changed to protected, new parameters, non virtual methods changed to virtual in some classes).
  • New features - these are changes which produce new functionality required by our project (e. g. clone support for Object and all used values inside).
  • Design - these are changes in xpdf object hierarchy or meaning of some components so they better fit our usage.

For more information see following detailed description.

Object.h and Object.cc

The Object class is used as the value keeper throughout the xpdf code. All value types are stored in this one class (moreover, in one union inside the Object class) and the real value type is identified by the Object type enumeration (returned by the getType method). We do not consider this design very good, because it does not prevent incorrect usage: a value of a different type than the one really stored can be obtained (no internal checking is done). Nevertheless this behavior is kept, because changing it would require a reorganization of the whole xpdf code. We have focused just on syntactic and new feature changes here.

Xpdf code uses an optimization for object copying: complex values (such as dictionaries or arrays) are not copied by the copy method at all and reference counting is used instead. Our Object usage (primarily in the CXref class) requires deep copying, so cloning support is necessary. We have added a new clone method:


which creates a new Object instance with a deep copy of the value held by the original. [1] The returned Object instance is independent of the original, so they don't influence each other when one is changed. Cloning of complex value types is delegated directly to the specialized clone method implemented on such a type.

Syntactic changes simply correct parameter modifiers: all methods with pointer parameters which can be const are changed to take them as const. This change is just cosmetic and should prevent incorrect usage of xpdf code.

XRef.h and XRef.cc

The XRef class is responsible for fetching pdf objects from a stream. Pdf defines so called indirect objects. These objects are identified by their object and generation numbers. XRef keeps and maintains the cross reference table, which maps indirect object identifiers to the file offset where the object is stored. Internally it uses the Parser and Lexer classes to parse file stream content into Object instances.

Design changes

First, XRef had to be prepared for transparent wrapper usage (see Wrapper design pattern), so all public methods were changed to virtual and private elements to protected (to enable access to and manipulation with them in the inheritance hierarchy). The XRef class is then wrapped by the CXref class (see the section called “CXRef”); the rest of the xpdf code doesn't know the difference and can be used without changes (with respect to XRef usage).

CXref reopen [2] functionality requires a correct change of the XRef internal state (which includes entries array reinitialization, trailer creation and so on). In the original implementation all of this was done in the constructor, and the clean up was done in the destructor. We have added new protected

	void initInternals(Guint pos);
	void destroyInternals();

methods, which use the same code as the original but are separated out, so that such an internal state change is possible anytime during the XRef instance's life.
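
The refactoring can be pictured with the following sketch. It is not the xpdf code: XRefLike and ReopenableXRef are hypothetical stand-ins, std::size_t replaces Guint, and the table "parsing" is a placeholder. Only the split of constructor and destructor work into reusable methods is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for XRef; only the init/destroy split is modelled.
class XRefLike {
public:
    explicit XRefLike(std::size_t pos) { initInternals(pos); }
    virtual ~XRefLike() { destroyInternals(); }

    std::size_t startPos() const { return pos_; }
    std::size_t entryCount() const { return entries_.size(); }

protected:
    // Formerly constructor-only work: build the state for the revision at pos.
    void initInternals(std::size_t pos) {
        pos_ = pos;
        entries_.assign(pos / 10, 0);  // placeholder for real table parsing
    }
    // Formerly destructor-only work: release the state; safe to call repeatedly.
    void destroyInternals() { entries_.clear(); }

private:
    std::size_t pos_ = 0;
    std::vector<int> entries_;
};

// Hypothetical wrapper in the role of CXref: reopen swaps the state in place.
class ReopenableXRef : public XRefLike {
public:
    using XRefLike::XRefLike;
    void reopen(std::size_t pos) {
        destroyInternals();
        initInternals(pos);
        assert(startPos() == pos);
    }
};
```

Because the two methods are protected, only descendants can trigger a state change, which matches the wrapper usage described above.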

New features

XRefWriter (see the section called “XRefWriter”), a descendant of the CXref class (which inherits directly from XRef), needs to know where it is safe to put data without destroying original document data when changes are written to the document (as an incremental update). To enable this, XRef has a new

	Guint eofPos;

field which contains the position of the %%EOF marker or the end of the document. The value is set in the constructor, because it has to be found out anyway, so XRefWriter doesn't have to do this work again.

XRef class didn't provide information whether pdf reference (object and generation number pair) is known [3] and so it wasn't possible to find out whether object value is null object or it is not present. To solve this problem, we have added new public

	virtual RefState knowsRef(Ref ref);

method which returns state of given reference. State is integer value with predefined constants which may hold:

  • UNUSED_REF - if there is no indirect object with given reference.
  • RESERVED_REF - if reference is reserved to be used, but no indirect object is registered yet. This state is used by CXref class to mark that reference is planned to be used and we are just waiting for some object to be used for it.
  • INITIALIZED_REF - if indirect object with given reference exists. These objects are considered when number of objects is required.

CXref and XRefWriter descendants override this method to reflect object added/reserved by their interface and additional logic (e. g. current revision and so on).

Implementation changes

XRef's getNumObjects returned the size of the allocated entries array. This is not very clean, because the entries array also contains free and unused entries. Moreover, the array is allocated in blocks, so there are more entries than real objects. This method is not used in xpdf code at all, so it could be reimplemented to return just the objects really in use (those with state INITIALIZED_REF).
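
A small sketch of how reference states and the corrected object count relate. RefTable, its Entry layout and the reserve/initialize helpers are assumptions made for illustration; the real XRef entry structure is different. Only the three state names and the method names knowsRef and getNumObjects come from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum RefState { UNUSED_REF = 0, RESERVED_REF, INITIALIZED_REF };

struct Ref { int num; int gen; };

// Hypothetical, simplified cross reference table.
class RefTable {
public:
    explicit RefTable(std::size_t capacity) : entries_(capacity, Entry{0, UNUSED_REF}) {}
    virtual ~RefTable() = default;

    void reserve(Ref r)    { at(r) = Entry{r.gen, RESERVED_REF}; }
    void initialize(Ref r) { at(r) = Entry{r.gen, INITIALIZED_REF}; }

    // Virtual, so that descendants can add their own bookkeeping.
    virtual RefState knowsRef(Ref r) const {
        if (r.num < 0 || static_cast<std::size_t>(r.num) >= entries_.size()) return UNUSED_REF;
        const Entry& e = entries_[static_cast<std::size_t>(r.num)];
        return e.gen == r.gen ? e.state : UNUSED_REF;
    }

    // Counts only objects that really exist, not the allocated capacity.
    std::size_t getNumObjects() const {
        std::size_t n = 0;
        for (const Entry& e : entries_) if (e.state == INITIALIZED_REF) ++n;
        return n;
    }

private:
    struct Entry { int gen; RefState state; };
    Entry& at(Ref r) {
        assert(r.num >= 0 && static_cast<std::size_t>(r.num) < entries_.size());
        return entries_[static_cast<std::size_t>(r.num)];
    }
    std::vector<Entry> entries_;
};
```

With a capacity of 8, one reserved and one initialized reference, getNumObjects reports one object, while the original behaviour would have reported the array size.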

Array.h and Array.cc

Array class represents the pdf array object. It is one of the complex value types. It may contain a number of Object instances. To enable Object cloning, new

	Array * clone()const;

is implemented. It returns new Array instance with same number of elements where each one (Object instance) is cloned (by Object::clone() method).

Dict.h and Dict.cc

Dict class represents the pdf dictionary data type. It is one of the complex value types, holding an association of objects with their names (key, value pairs where the key is a name object and the value is an Object instance).

Design changes

DictEntry used as entry (key, value pair association) kept value (Object instance) as normal instance. This was changed to pointer to instance to enable simpler value updating.

Original code didn't use the const modifier for the key (char * typed) parameter, so it wasn't clear whether a method uses the given value and stores it (so the parameter can't be deallocated after the method returns) or just uses it to get information (so it can be deallocated). This could lead to memory leaks or, worse, to duplicate deallocation of the same memory. To solve these potential problems, all methods which don't store the key have a const char * key parameter.

New features

Dict as complex object stored in general Object data keeper has to support cloning, so new

	Dict * clone()const;

is added. It returns new Dict instance with same number of entries where each entry is deep copied - name string and associated object (Object instance) is cloned (by Object::clone() method).

New method for simpler updating value has been added:

	Object * update(char * key, Object * val);	    

This method adds a new entry if no such entry is in the dictionary; otherwise it replaces the old value with the given one and returns the original.

Original implementation didn't contain any method for entry removing and so new

	Object * del(const char * key);	 

has been added. It removes the entry with the given key and returns the associated value.
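
The semantics of update and del can be sketched like this. DictLike and Value are hypothetical stand-ins for Dict and Object, ownership is expressed with std::unique_ptr instead of raw pointers, and returning a null pointer for a key that was not present is an assumption, since the text does not say what is returned in that case.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for xpdf's Object; only an int payload is modelled.
struct Value { int n; };

class DictLike {
public:
    // Adds the key or replaces its value; returns the replaced value,
    // or null if the key was new (assumed behaviour).
    std::unique_ptr<Value> update(const std::string& key, std::unique_ptr<Value> val) {
        assert(val != nullptr);
        std::unique_ptr<Value>& slot = entries_[key];
        std::swap(slot, val);
        return val;
    }
    // Removes the key and hands its value to the caller; null if absent.
    std::unique_ptr<Value> del(const std::string& key) {
        auto it = entries_.find(key);
        if (it == entries_.end()) return nullptr;
        std::unique_ptr<Value> old = std::move(it->second);
        entries_.erase(it);
        return old;
    }
    // Read-only lookup; the key is not stored, so it is taken by const reference.
    const Value* lookup(const std::string& key) const {
        auto it = entries_.find(key);
        return it == entries_.end() ? nullptr : it->second.get();
    }
private:
    std::map<std::string, std::unique_ptr<Value>> entries_;
};
```

In both methods the caller becomes the owner of the returned value, which is the point of returning it rather than destroying it inside the dictionary.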

Stream.h and Stream.cc

Xpdf code defines the Stream hierarchy to describe pdf stream objects. Streams (as pdf data types) define a container of pdf objects. This container is associated with a dictionary which contains information about its length and the filters which were used for stream encoding. The XRef class reads its data from a stream, and a content stream object is also based on a stream.

Design changes

Stream is the base class both for normal streams, represented by the BaseStream base class (a Stream descendant), and for all filtered streams, represented by the FilterStream base class (also a direct Stream descendant). This hierarchy of stream objects is strictly specialized for reading and can't be used for making changes to stream data. CXref and XRefWriter, however, need to make transparent modifications to the stream with pdf content (so that xpdf code using Streams doesn't have to be changed very much). This is the reason for some changes in the Stream hierarchy design.

The problem of stream modification is solved by a new abstract class, StreamWriter (the base class for all specialized stream modifiers). It defines the interface for stream writing (in the same way as Stream defines operations for reading). However, the implementation of a concrete writer (such as FileStreamWriter) requires multiple inheritance, because it needs the interface from StreamWriter and also access to the fields of a concrete BaseStream (FileStream in the case of FileStreamWriter). So the original inheritance of all direct descendants of Stream and BaseStream had to be changed to virtual (to prevent ambiguity). This model enables transparent usage of StreamWriter implementations (like FileStreamWriter) as Stream typed instances in xpdf code and as StreamWriter typed instances, for writing, in our higher level pdfedit classes.

The FilterStream hierarchy is untouched in terms of design, because our project doesn't change filtered streams directly. It works just with the base stream, because the FilterStream hierarchy is hard to reuse for encoding. So just the decode functionality is used.
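
The need for virtual inheritance can be seen in a miniature model. The class names mirror the text, but the members (getChar, putChar, a std::string in place of a FILE handle) and the exact base lists are assumptions chosen to keep the sketch short.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Stream {
public:
    virtual ~Stream() = default;
    virtual int getChar() = 0;          // reading interface
};

class BaseStream : public virtual Stream {};

class StreamWriter : public virtual BaseStream {
public:
    virtual void putChar(char c) = 0;   // writing interface
};

class FileStream : public virtual BaseStream {
public:
    explicit FileStream(std::string data) : data_(std::move(data)) {}
    int getChar() override {
        return pos_ < data_.size() ? static_cast<unsigned char>(data_[pos_++]) : -1;
    }
protected:
    std::string data_;                  // stands in for the FILE handle of the real class
    std::string::size_type pos_ = 0;
};

// Virtual inheritance leaves exactly one Stream/BaseStream subobject here,
// so conversions to Stream* are unambiguous.
class FileStreamWriter : public StreamWriter, public FileStream {
public:
    explicit FileStreamWriter(std::string data) : FileStream(std::move(data)) {}
    void putChar(char c) override { data_.push_back(c); }
};

// Usage: reading code sees a Stream, writing code sees a StreamWriter.
inline void usageExample() {
    FileStreamWriter w("a");
    StreamWriter* writer = &w;
    Stream* reader = &w;
    writer->putChar('b');
    assert(reader->getChar() == 'a');
    assert(reader->getChar() == 'b');
}
```

Without the virtual keyword on the base lists, FileStreamWriter would contain two BaseStream subobjects and the conversion to Stream* would not compile.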

Stream hierarchy

New features

Stream, as one of the complex value data types stored in Object (as all other data types), has to provide cloning support. We have added abstract

	virtual Stream * clone()=0;	     

method in Stream base class. Each specific stream implementer has to provide its own clone implementation. No default implementation is written in Stream directly, to force all specific filters to provide one. If any of the filters is not able to create a clone, this method should return NULL. This should not happen, however a clone implementation has to be aware of it (and has to check whether the filter stream has been cloned correctly).

FileStream is a stream which reads data directly from a FILE stream, so cloning has to copy all data (from stream start to the end, or just the first length bytes if the stream is limited) somewhere else. Creating a new file just for a temporary clone is not very efficient and may cause several problems (not enough free space, creation and removal of the temporary file, etc.). We have solved this problem by creating a MemStream with a buffer containing the same data as the FileStream. This breaks the clone contract a bit, because the cloned stream is not precisely the same as the original: it is represented by another Stream class. Nevertheless it keeps the most important part of the contract: the user of the Stream interface doesn't know the difference, and clone and original don't affect each other when one is changed.

MemStream represents a stream stored in a buffer in memory. Cloning is therefore straightforward: just the buffer is copied for the new MemStream, and all other attributes are set according to the copied buffer. Buffer copying starts at the start field position and length bytes are copied, so the final MemStream contains just the data used in the original one. Finally, the needFree field is always set to true, because we have allocated a new buffer for the clone and it has to be deallocated when the MemStream is destroyed. [4]

EmbedStream is a special stream which is stored in some other stream. Cloning is also very simple, because this stream just holds one Stream pointer field and some attributes which don't change during the instance life cycle. Cloning is then just delegation to the cloning of the stream field, followed by creation of a new EmbedStream with the cloned value and the same parameters which were used for the given instance.

The FilterStream branch represents the different types of filters which can be used to encode stream data (the Pdf specification describes which filters can be used). The hierarchy and design of filters follows the Decorator design pattern: each filter works with an underlying stream, stored in a pointer field typed as Stream (so it can be either a stream with data - MemStream, FileStream or EmbedStream - or another filter stream).

FilterStream is cloned in a similar way as EmbedStream. Each filter implementation holds a Stream pointer (as already mentioned). This is cloned by the clone method (defined in the super type). When the underlying stream is cloned (and its clone method returned a non NULL value, which means that the stream supports cloning), the current stream creates a new filter stream instance of the same type with the same configuration parameters (these are usually the parameters which were given to it in the constructor; when such attributes can change over time, they have to be stored separately in the constructor specially for this purpose). [5] The general implementation for all filter streams is as follows:
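
The MemStream cloning rule (copy only the used window, always own the new buffer) can be sketched as follows. MemStreamLike is a reduced, hypothetical model; the accessor names are invented, and only the start, length and needFree roles are taken from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, reduced MemStream: a window [start, start+length) over a buffer.
class MemStreamLike {
public:
    MemStreamLike(char* buf, std::size_t start, std::size_t length, bool needFree)
        : buf_(buf), start_(start), length_(length), needFree_(needFree) {}
    ~MemStreamLike() { if (needFree_) delete[] buf_; }
    MemStreamLike(const MemStreamLike&) = delete;
    MemStreamLike& operator=(const MemStreamLike&) = delete;

    // Copies only the used window into a fresh buffer owned by the clone.
    MemStreamLike* clone() const {
        char* copy = new char[length_];
        std::memcpy(copy, buf_ + start_, length_);
        return new MemStreamLike(copy, 0, length_, true);
    }

    char at(std::size_t i) const { assert(i < length_); return buf_[start_ + i]; }
    void set(std::size_t i, char c) { assert(i < length_); buf_[start_ + i] = c; }
    std::size_t length() const { return length_; }
    bool ownsBuffer() const { return needFree_; }

private:
    char* buf_;
    std::size_t start_;
    std::size_t length_;
    bool needFree_;
};
```

An original that merely borrows a larger buffer (needFree false) thus produces a clone that starts at offset 0 of its own buffer and frees it on destruction.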

	// clones underlying stream
	Stream * cloneStream=str->clone();
	// if underlying stream doesn't support cloning, it will fail too
	if(!cloneStream)
		return NULL;

	// creates same typed filter stream with same parameters and cloned
	// stream
	return new ...(cloneStream[, specific_parameters]);

As mentioned above, some filters do not keep the values given to them as constructor parameters, so it is hard to reconstruct the same filter. In particular, all filters which hold a StreamPredictor field have an additional field with PredictorContext (added by us):

	struct PredictorContext
	{
	  int predictor;
	  int width;
	  int comps;
	  int bits;
	};

where all parameters needed for StreamPredictor creation are stored. This structure is initialized in constructor and never changed. It is just used for cloning.
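
Put together, the cloning of a filter which keeps such a context can be sketched as below. This is an illustration under assumptions: DataStream, UncloneableStream and PredictingFilter are hypothetical classes, clone is const and returns std::unique_ptr instead of a raw pointer, and a null result plays the role of NULL.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Stream {
    virtual ~Stream() = default;
    // Returns null when the stream cannot be cloned.
    virtual std::unique_ptr<Stream> clone() const = 0;
};

// Stream with data; its clone is a deep copy.
struct DataStream : Stream {
    explicit DataStream(std::string d) : data(std::move(d)) {}
    std::unique_ptr<Stream> clone() const override {
        return std::unique_ptr<Stream>(new DataStream(data));
    }
    std::string data;
};

// Stream that does not support cloning.
struct UncloneableStream : Stream {
    std::unique_ptr<Stream> clone() const override { return nullptr; }
};

struct PredictorContext { int predictor; int width; int comps; int bits; };

// Decorator which stores its constructor parameters for later cloning.
struct PredictingFilter : Stream {
    PredictingFilter(std::unique_ptr<Stream> s, PredictorContext c)
        : str(std::move(s)), ctx(c) { assert(str != nullptr); }
    std::unique_ptr<Stream> clone() const override {
        std::unique_ptr<Stream> inner = str->clone();
        if (!inner) return nullptr;   // underlying stream is not cloneable
        return std::unique_ptr<Stream>(new PredictingFilter(std::move(inner), ctx));
    }
    std::unique_ptr<Stream> str;
    const PredictorContext ctx;       // initialized once, used only for cloning
};
```

The context is const, mirroring the statement that the structure is initialized in the constructor and never changed.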

Parser.h and Stream.cc

Xpdf reads pdf files using two level mechanism. The lowest level, called Lexer, decodes streams if necessary and reads byte after byte. The second level, called Parser, reads from Lexer byte after byte until one complete object is read. This can be applied recursively. Then it initializes xpdf Object class with type and data of the read object. Parser object can read objects either from a single stream or from an array of streams (simply reading one stream after another could result in incomplete objects returned at the end of a stream). Parser is used to parse all objects including content stream tags.

xpdf Parser class

New features

Content streams can consist of several streams. Decoding and then concatenating these streams must form a valid content stream. The problem is that the content stream can be split into streams at an almost arbitrary place. The feature which was missing in the Parser object is the information about when it has finished reading one stream and started reading another. We have added the method

	// End of actual stream
	bool eofOfActualStream () const { return (1 == endOfActStream); }

where endOfActStream is a new variable indicating how many objects have been buffered from the current stream. Content streams can consist of many small valid content streams. When they are split correctly, the user can easily delete or add small content streams. Changes made by PDFedit can be considered to be such small valid content streams. After saving our changes we want to see them separately from existing content streams. This new feature is used to split the streams which together create one page content stream into many small content streams. Because of the object buffering done by Parser, the new feature had to be implemented in this special way.

[1] Clone is considered to be an object of the same type as the original with an equal value (deep copy). The following contract has to be fulfilled (unless there is a special limitation which makes a part of it impossible or unnecessary to keep; such a situation must be explicitly described).

	Object * clone=original->clone();
	clone!=original;				// instances must be different
	clone->getType()==original->getType();	// type must be same
	equals(clone, original)==true;		// content is same
	equals(change(clone), original)==false;	// change to clone or original doesn't influence the other one
	equals(clone, change(original))==false;
	destroy(clone) && exists(original) == true;	// destroying clone or original keeps the other one existing
	destroy(original) && exists(clone) == true;

[2] This method is essential for revision changing, because it forces the XRef supertype to do the clean up for the current revision and to init all internals for the revision starting at the given position (so all document content before this position).

[3] Information whether there is indirect object with this reference.

[4] Note that a MemStream may hold a buffer which is not used as a whole. This is controlled by the start field, which says where the first byte of this particular stream is. This is the original behaviour of the MemStream implementation; the reason is that in many situations MemStream is used to manipulate buffers without the need to copy them. Xpdf code just reuses some buffer and says that the new MemStream starts at a given position and has a given length. Such a MemStream is marked by the needFree field so that the buffer is not deallocated in the destructor.

[5] As an example, FlateStream can't reconstruct its original parameters (it uses them for StreamPredictor initialization in the constructor but doesn't store them in internal fields). We have added additional data structures to store such parameters. This kind of workaround makes it possible to create a filter with the same attributes.

Kernel design description

The kernel is the object layer which provides an interface for manipulating a pdf document, its high level entities (like pages, outlines and annotations) and all properties of these entities. All these objects keep document logic inside and provide an interface for simple manipulation to higher layers. Higher layers (GUI and script in our case) should use only these objects to get or change document related information.

All kernel code is stored in the src/kernel directory and consists of a set of classes. The classes are separated into 2 groups according to the pdf related logic implemented inside the class:

  • High level objects - which wrap pdf high level entities, such as pages (the section called “CPage”), annotations (the section called “CAnnotation”), outlines, whole document (the section called “CPdf”).

    Each object has certain properties which are defined by the pdf specification, and those are returned in the form of low level objects. If a pdf entity contains another high level entity in its substructure (like pages contain annotations etc.), this entity is responsible for the creation and maintenance of such a high level object. CPdf is then the root of all high level objects.

  • Low level objects - which wrap pdf data types. [6] According to the character of the value (what can be stored inside) we distinguish 2 categories of data types:
    • Simple types, which hold simple data such as integral values, floating point values, strings, names, operators, etc.
    • Complex types, which hold other data types inside, such as arrays, dictionaries and streams.

The following chapters focus more deeply on particular parts of the kernel. First, all classes which are used as interface objects for higher layers are described. Then come some kernel classes which are not part of the interface but are used internally by interface classes. Finally there is a description of how document changes and revisions are handled.

Chapter 5. Interface objects

As described above, the kernel communicates with higher layers (see Chapter 2, PDFedit layers) through objects called cobjects. Those cobjects can be high level and low level. This chapter and its sections describe these objects, their responsibilities and mutual cooperation.

Low level CObjects

Pdf file consists of objects. These objects are referenced from a special structure forming a tree. Objects can be either simple (number, string, name, boolean value, reference, null) or complex (array, dictionary, stream).

CObject class hierarchy

General description

CObjects are objects in pdfedit which represent objects in a pdf file. All cobjects are derived from one base class, IProperty. Objects form a tree like structure, so we can divide objects into single objects (leaves) and composite objects (nodes). This is an example of the Composite design pattern. This approach differs from the xpdf implementation, where every pdf object is represented by the same huge class. The concept of having exactly one class representing every pdf object leads to many problems:

  • inheriting - unclear oo design, unmanageable, it breaks the idiom of one entity for one purpose

  • adding changing operations - would result in even more monstrous class

  • sometimes value inside class, sometimes outside - unclear oo design

There are many interesting design decisions in the xpdf objects implementation. For example, memory handling makes it almost unsafe to delete objects from complex types. The memory allocation policy, that is who is to deallocate an xpdf Object, when and how, is a mess which could easily lead to either memory leaks or memory corruption. The new design uses a separate class for each different pdf object type. Because of the complexity of pdf decoding (pdf can be encoded using many filters), these objects use xpdf Object just for initialization and for dispatching changes to the CXref object. Objects can not be simply copied, because it is not clear whether a copy should be a deep copy with all indirect objects copied or not.

Object class hierarchy

Base class

Every object is derived from one base class - IProperty. This base class is a hanger which can be used to access child objects uniformly. This class is a read only interface for all properties. Objects can be created uninitialized or can be initialized from an xpdf Object of the same type, and simple objects can be initialized directly from a value.

Simple objects

Simple objects are very similar. They share behaviour and because of this also method names (only value keeper is different). They are represented by one class using c++ templates. One template class (CObjectSimple), parameterized by object type, represents all 7 types of simple objects.
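
A minimal sketch of such a template, assuming a hypothetical type tag enumeration and a ValueOf trait that maps each tag to its value keeper. Only four of the seven simple types are modelled, and the method names are illustrative rather than the actual pdfedit signatures.

```cpp
#include <cassert>
#include <string>

enum PropertyType { pBool, pInt, pReal, pString };

// Hypothetical reduction of the read only base interface.
class IProperty {
public:
    virtual ~IProperty() = default;
    virtual PropertyType getType() const = 0;
};

// Maps a type tag to the C++ type that keeps the value.
template <PropertyType T> struct ValueOf;
template <> struct ValueOf<pBool>   { typedef bool type; };
template <> struct ValueOf<pInt>    { typedef int type; };
template <> struct ValueOf<pReal>   { typedef double type; };
template <> struct ValueOf<pString> { typedef std::string type; };

// One template covers every simple type; only the value keeper differs.
template <PropertyType T>
class CObjectSimple : public IProperty {
public:
    typedef typename ValueOf<T>::type Value;
    CObjectSimple() : value_() {}
    explicit CObjectSimple(const Value& v) : value_(v) {}
    PropertyType getType() const override { return T; }
    const Value& getValue() const { return value_; }
    void setValue(const Value& v) { value_ = v; }
private:
    Value value_;
};

typedef CObjectSimple<pInt> CInt;
typedef CObjectSimple<pString> CString;

// Usage: uniform access through the base, typed access through the template.
inline void usageExample() {
    CInt count(3);
    const IProperty& generic = count;
    assert(generic.getType() == pInt);
    assert(count.getValue() == 3);
}
```

Unlike the single xpdf Object class, a CInt cannot be asked for a string value: the mismatch is rejected at compile time.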

Complex objects

It is more difficult with complex types. Each complex type must contain references to its children which are also pdf objects. A design decision was made to use smart pointers for referencing child objects. The reasons are:

  1. Allocation and deallocation policy - when an object is deallocated, we cannot be sure that nobody still holds a pointer to it. This could be solved by implementing reference counting, but why reinvent the wheel.

  2. Automatic deallocating when not needed.

Pdf objects can be referenced using ids which are similar to pointers. This brings many problems. One of them is the impossibility to delete such objects. Many of the problems are automatically handled by smart pointers.

Array and dictionary

CArray stores its children in a simple container indexed by position. CDict stores its children in a container indexed by a string which uniquely identifies a property. The beauty of smart pointers shows when an array is deallocated: it automatically deallocates its children when they are not referenced from elsewhere, and this is done recursively.
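
The recursive deallocation can be demonstrated with a small model. The classes below are simplified stand-ins that reuse the names from the text; std::shared_ptr is used in place of boost::shared_ptr, and the append helper is an assumption.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct IProperty {
    virtual ~IProperty() = default;
};

struct CInt : IProperty {
    explicit CInt(int v) : value(v) {}
    int value;
};

// Children are held by shared_ptr, so the container never deletes them explicitly.
struct CArray : IProperty {
    void append(std::shared_ptr<IProperty> child) {
        assert(child != nullptr);
        items.push_back(std::move(child));
    }
    std::vector<std::shared_ptr<IProperty>> items;   // indexed by position
};

struct CDict : IProperty {
    std::map<std::string, std::shared_ptr<IProperty>> entries;   // indexed by name
};
```

Dropping the last reference to a dictionary releases the array stored in it and, in turn, every array element that nobody else holds; an element that is still referenced from outside simply stays alive.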


Streams are the most complicated of all pdf objects. The problem is that xpdf can decode pdf files but can not do it the other way round (because it never needs to). The xpdf implementation of streams is very rough. We use boost filtering iostreams, which provide us with the necessary general concept of encoded streams. However, we have to implement the filters ourselves (the easiest way is to save decoded streams without any filters). We do not have to know the filters as long as we do not change the object. We can either modify the raw encoded stream, or save a decoded stream which is automatically encoded using the available filters when saved. Each stream consists of a dictionary and a stream buffer. The dictionary can not be accessed directly; its interface is simulated by simple methods which delegate the calls to the dictionary object. The buffer is stored in encoded form, allowing us to return the same byte representation of an unchanged object as was read from the pdf file. At the time of writing, not all reverse filters have been implemented.

Accessing streams

We access streams using the Adapter design pattern implementing an open/close interface. We need to be able to read from several streams byte by byte, because content streams can be split anywhere. We decided to return only xpdf objects.

Changing objects

Every object can be obtained from CXref (see the section called “CXRef”) when its reference number is known, and then changed using the IProperty interface. The internal state of special objects (cpage, ccontentstream, etc.) depends on these raw objects. Therefore a mechanism was designed to allow special objects to be notified about raw changes. Objects implement the subject interface from the Observer design pattern, which allows anyone to register observers on these objects. An observer gets notified each time the object is changed.


CPdf class is the main pdf document class. It maintains document content using an XRefWriter field and the document catalog PDF dictionary, and provides other specialized high level objects such as CPage (see the section called “CPage”) and COutline.

Its main responsibility is to keep all objects it provides synchronized with the data they are based on. As an example, it has to keep CPage instances synchronized with the current state of the page tree.

In design terminology, CPdf implements the Facade design pattern for manipulating the file at document scope. All internal objects used for particular purposes are hidden from the class user and CPdf provides an interface for manipulating them (as an example, CPdf uses XRefWriter (see the section called “XRefWriter”), which enables making changes to the document, but exports only CXref (see the section called “CXRef”), which enables just object fetching - almost the same interface as the Xpdf XRef class).

Instance life cycle

An instance of CPdf can be created only by the getInstance factory method (see Factory method design pattern) and destroyed only by the close method. A CPdf instance is a single purpose object which maintains exactly one document during its lifetime (between creation and close).

CPdf modes

Each document may be opened in one of several modes. Each mode controls the set of activities which can be done. ReadOnly mode guarantees that no change can be made to the document. ReadWrite mode enables making changes but with some restrictions (see programming documentation for more information). Finally, Advanced mode brings full control over the document.
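
The factory-controlled life cycle together with the open modes can be sketched as follows. PdfDocument, its constructor arguments and the canChange helper are hypothetical; the real getInstance and close signatures are not shown in this document, and only the method and mode names are taken from the text.

```cpp
#include <cassert>
#include <string>

enum OpenMode { ReadOnly, ReadWrite, Advanced };

// Hypothetical stand-in for CPdf; only creation, destruction and mode are modelled.
class PdfDocument {
public:
    // Factory method: the only way to obtain an instance.
    static PdfDocument* getInstance(const std::string& fileName, OpenMode mode) {
        assert(!fileName.empty());
        return new PdfDocument(fileName, mode);
    }
    // The only way to destroy an instance; the pointer is invalid afterwards.
    void close() { delete this; }

    const std::string& fileName() const { return fileName_; }
    bool canChange() const { return mode_ != ReadOnly; }

private:
    // Private constructor and destructor rule out stack instances and plain delete.
    PdfDocument(const std::string& fileName, OpenMode mode)
        : fileName_(fileName), mode_(mode) {}
    ~PdfDocument() = default;
    PdfDocument(const PdfDocument&) = delete;
    PdfDocument& operator=(const PdfDocument&) = delete;

    std::string fileName_;
    OpenMode mode_;
};
```

Since copying is disabled as well, one instance corresponds to exactly one opened document between getInstance and close.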

Properties changing and revision handling

All changes to the document are done using XRefWriter, as described in Chapter 7, Changes to pdf document. The additional logic and responsibility of CPdf in this respect is to act as an adapter from the IProperty interface of a property to the xpdf Object required by XRefWriter. Moreover, it also provides an interface to get indirect objects in IProperty form. This means that it hides low level information about who does the parsing and storing and what data types are used. It also guarantees that each indirect property is always represented by the same IProperty instance, to enable their reasonable usage.

To also enable data exchange between documents (in the form of properties), it provides functionality for adding indirect properties. When a property comes from another CPdf (which may mean another document), it does all the necessary handling for this situation (e. g. all other indirect objects which are referred to from the added one are added too).

Revision handling is done in a similar way, but in this case without any special logic. Changing the revision, getting the current revision and cloning are directly delegated to XRefWriter. If a document save is required, CPdf just checks that the mode is not read only and delegates the rest to XRefWriter.

Provided high level objects

CPdf provides high level objects, each maintaining some specialized part of the Pdf document catalog. These objects add logic to properties with special meaning in the pdf document. [7]

CPdf facade scheme


Pdf dictionaries referenced from the Pdf page tree are called page objects. These dictionaries must contain a "Type" entry and the value must be a name object equal to "Page". Page objects are basic building blocks of pdf files. Each page can be independent with all required information stored in its dictionary. Some of its properties can be inherited from parent pages. A page object describes the appearance of the page (width, length, fonts used, rotation etc.).

The core of a page is one or more content streams which specify what is on a page (text, pictures, graphics ...).

Observing cpage

Generally, changing an object can result in many other changes. Objects often depend on other objects. Changing a page property can result in
  • redraw of the page

  • redraw of other pages

This is the reason why cpage implements the observer subject interface (see the section called “Observers”). An object can be notified if cpage changes.

Changing cpage

Page dictionary can be changed in two ways: either by cpage methods or by requesting the raw dictionary by its reference number. If we do not want to parse the whole pdf file, we do not have the information whether an object is a page or just an object with page type. This problem is solved by the Observer design pattern. We observe the underlying dictionary, which implements the observer subject interface. This way, we know every time the dictionary is changed, either by cpage or by cobject.

Changing page contents

Any change to a visible part of a page is delegated to the content stream object (see the section called “CContentStream”).

Displaying page

Xpdf has the best displaying engine of all tested viewers. All calls to display methods are delegated to this engine. When displaying a page CPage creates xpdf Page object from its actual state. Then it uses Page object method displaySlice() to display a rectangle of a page. Xpdf creates the graphical environment for drawing into an output device. Finally it draws the page into supplied device.

Objects on a page

Everything on a page is in its content stream(s). Every operation in a content stream means altering the actual position by moving the drawing pen from one position to another position creating a rectangle which can be used to constrain each object. The rectangle can be used to order all objects into a structure which can be easily searched. This enables effective selection of only some objects.

Text on a page

Pdf specification does not force pdf converters to keep the text structure of the converted document. This means that no text element needs to correspond to an element in the original file, e.g. paragraphs, lines, words. Even the order of single letters can be arbitrary. Xpdf (or any other sophisticated viewer) only guesses which letters form a word or which words form a line. We use the xpdf text engine to extract and search text.

Fonts on a page

A pdf file can either refer to external system fonts, which are defined by the pdf specification and which each viewer should have available, or it can embed font metrics into the pdf file. The latter option is very tricky, because it allows the font to contain only those letters which are actually used. The CPage object supports adding fonts which can be used on a page. Fonts must be present in the pdf file or they must refer to system fonts.


Annotation is interactive entity which is associated with rectangle on page. They are organized in Annots array entry in page dictionary and so CPage is responsible for returning of all available annotations and also to provide interface to add new annotation.

Each annotations is described by dictionary, which has to contain at least Type element with Annot and Subtype element with concrete annotation type. Pdf specification describes several types of annotation types (e. g. text annotation - which describes text box floating upon normal page text, link annotation - which enables to jump to the target within document or to perform certain action when link is activated by mouse click). Rect element should be present as well, because it specifies where annotation should be spreaded.

CAnnotation represents such an annotation and provides a simple interface to manipulate the elements of the annotation dictionary. It implements ObserverHandler (see the section called “Observers”) to enable users to be informed about changes inside the annotation dictionary. This class provides just a simple interface for internal manipulation and is intended to be a base class for specific annotation types (no such specialized classes are available yet, because the project does not require them at the moment).

Instance creation

An instance can be created in 2 different ways.

The first possibility is to use an existing annotation dictionary (e.g. one fetched from the document). This way is used in the CPage class (see the section called “CPage”), where already existing annotations are fetched and used to create CAnnotation instances.

The second way is to use the factory

	static boost::shared_ptr<CAnnotation> 
		createAnnotation(Rectangle rect, std::string annotType);

method (see the Factory method design pattern). This method uses the internal static annotInit initializator (see the section called “Annotation initializator”). The initializator is responsible for correct initialization of the annotation dictionary according to the given type and rectangle. This is the safe way to create new annotation instances.

Annotation initializator

The annotation initializator is represented by the IAnnotInitializator abstract class. It provides a functor which initializes the given dictionary with correct data according to the given type. The initializator is designed as a Composite design pattern, so one initializator class can support initialization of several annotation types (getSupportedList returns the annotation types supported by the initializator).

We have implemented UniversalAnnotInitializer which adds

	bool registerInitializer(std::string annotType, boost::shared_ptr<IAnnotInitializator> impl, bool forceNew=false);

method responsible for registering another initializator in the composite of initializators. When createAnnotation is called, UniversalAnnotInitializer chooses the registered implementation which supports the given annotation type. UniversalAnnotInitializer itself just initializes the elements common to all annotations (such as the Type, Subtype and Rect elements).
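
The composite of initializators can be sketched roughly as follows. This is only a minimal illustration and not the PDFedit code: the names InitializerSketch, LinkInitializerSketch, UniversalInitializerSketch and AnnotDictSketch are hypothetical, the dictionary is reduced to a string map, std::shared_ptr stands in for boost::shared_ptr, and the meaning given to forceNew (replace an existing registration) is an assumption.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an annotation dictionary: entry name -> value.
using AnnotDictSketch = std::map<std::string, std::string>;

// Simplified initializer interface; the real IAnnotInitializator differs.
struct InitializerSketch {
    virtual ~InitializerSketch() = default;
    // Returns true if the dictionary was initialized for annotType.
    virtual bool init(AnnotDictSketch& dict, const std::string& annotType,
                      const std::string& rect) const = 0;
};

// Type-specific initializer used only as an example.
struct LinkInitializerSketch : InitializerSketch {
    bool init(AnnotDictSketch& dict, const std::string& annotType,
              const std::string&) const override {
        if (annotType != "Link") return false;
        dict["H"] = "I";  // illustrative default: invert highlighting
        return true;
    }
};

// Composite: sets the common entries itself and delegates the rest to the
// initializer registered for the requested type, if there is one.
class UniversalInitializerSketch : public InitializerSketch {
public:
    // Assumed semantics: an existing registration is kept unless forceNew.
    bool registerInitializer(const std::string& annotType,
                             std::shared_ptr<InitializerSketch> impl,
                             bool forceNew = false) {
        if (!forceNew && registry_.count(annotType) != 0) return false;
        registry_[annotType] = std::move(impl);
        return true;
    }

    bool init(AnnotDictSketch& dict, const std::string& annotType,
              const std::string& rect) const override {
        dict["Type"] = "Annot";
        dict["Subtype"] = annotType;
        dict["Rect"] = rect;
        auto it = registry_.find(annotType);
        if (it != registry_.end()) it->second->init(dict, annotType, rect);
        return true;
    }

private:
    std::map<std::string, std::shared_ptr<InitializerSketch>> registry_;
};
```

A factory in the spirit of createAnnotation would create an empty dictionary, pass it to the universal initializer and wrap the result in a CAnnotation instance.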

CAnnotation class hierarchy


The CXRef class wraps the Xpdf XRef class and provides additional functionality with the same interface (see the Wrapper design pattern). It provides a protected interface for making changes and internally stores changed objects. When an object is fetched (the fetch method is called), it checks whether the object has been changed and, if so, uses the changed value. Otherwise it delegates to the original XRef implementation (this logic is kept in all methods defined in XRef).

CXRef inherits from XRef, so it can be used polymorphically in xpdf code, and this code does not need any changes to use CXref functionality. The additional interface enables changes, but because we want to keep making changes under control, it is protected and so accessible only to descendants.

Added functionality includes:

  • new indirect objects creation - creates a new pdf object and associates it with a reserved reference.
  • changing of already existing indirect objects - the changeObject method replaces the object with the given reference by the given object.
  • changing of document trailer - add, remove or change elements of the pdf trailer.
  • checking for type safety - checks whether the given object can safely replace the original value (in the document or in its currently saved form) with respect to types. A change is considered type safe when the new value has the same type as the old one, or the dereferenced types are the same (if any of the values is a reference), or the original value is CNull, in which case the new value may have an arbitrary type.
  • reopen functionality - the reopen method reopens the document content with the cross reference table at the specified position. This is used to change the current revision of the document, because the cross reference table position is specific to each revision.

For more information about CXref usage, please see Chapter 7, Changes to pdf document.
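
The way changed objects shadow the file content can be illustrated with a small sketch. All names below (BaseXRefSketch, OverlayXRefSketch, WriterSketch, RefSketch) are hypothetical, objects are reduced to strings, and the real XRef and CXref interfaces are considerably richer.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// (object number, generation number) identifies an indirect object.
using RefSketch = std::pair<int, int>;

// Stand-in for xpdf's XRef: serves objects parsed from the file.
class BaseXRefSketch {
public:
    BaseXRefSketch() { fileObjects_[RefSketch(1, 0)] = "(original)"; }
    virtual ~BaseXRefSketch() = default;
    virtual std::string fetch(const RefSketch& ref) const {
        auto it = fileObjects_.find(ref);
        return it == fileObjects_.end() ? std::string("null") : it->second;
    }

private:
    std::map<RefSketch, std::string> fileObjects_;
};

// Overlay in the spirit of CXref: changed objects shadow the file content.
class OverlayXRefSketch : public BaseXRefSketch {
public:
    std::string fetch(const RefSketch& ref) const override {
        auto it = changed_.find(ref);
        if (it != changed_.end()) return it->second;  // changed value wins
        return BaseXRefSketch::fetch(ref);            // otherwise delegate
    }

protected:
    // Protected on purpose: only descendants may register changes.
    void changeObject(const RefSketch& ref, std::string value) {
        changed_[ref] = std::move(value);
    }

private:
    std::map<RefSketch, std::string> changed_;
};

// Descendant which exposes the change interface (the role of XRefWriter).
class WriterSketch : public OverlayXRefSketch {
public:
    using OverlayXRefSketch::changeObject;
};
```

Code which only knows the base type keeps working unchanged and still sees the changed values, which is the point of inheriting from XRef instead of wrapping it by composition.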

CXref class scheme


A content stream can consist of one or more pdf streams. It is responsible for everything on a page: if anything visible is changed, the content stream must be changed. A content stream is processed sequentially. A page can have one or more content streams and these streams must be concatenated before reading (objects can be split between two content streams). A content stream consists of operators and their operands. Each operator updates the graphical state.

Observing ccontentstream

Generally, changing anything visible on a page means changing something in the underlying content stream. Because operators are processed sequentially, changing one operator or operand may affect many following operators (e.g. their bounding boxes). The page needs to know when a content stream changes, because it must reparse the operators. CContentStream implements the observer subject interface so that, for example, cpage (as the content stream maintainer) can be informed when it is changed.

Changing ccontentstream

A content stream can be changed in two ways: either by ccontentstream methods or by requesting a raw operator and changing its operands. A third way is to add or delete a whole stream. This problem is solved by ???.

Storing pdf operators

Operators are processed sequentially and there are many situations when only some types of operators are needed. A clean solution is to use the Iterator design pattern. With this pattern we can process operators one by one. If we need a specific iterator, we just create another child of the basic iterator. There are simple and composite operators, and together they form a tree-like structure, which is more readable than a flat list of all operators. So we implement a second, tree-like queue. Only the first level of operators is stored in ccontentstream; each composite operator stores its children. This is an example of the Composite design pattern: simple and composite operators are accessed uniformly.

Operator queues

Changing pdf operators

Deleting and inserting an operator is not easy because each operator is stored in two queues. We have information about only one queue (the one which was used to get the operator), so we need to find the position in the second queue and change it accordingly.


Every page has its content stream which contains the description of every object on the page. A content stream consists of operators and their operands, which specify how to alter the graphical state of the page. These operators specify what is displayed and how. They are processed sequentially. If no content stream is available (or it is empty), the page is empty.


Content stream consists of operators and their operands. Operators can be either simple or composite objects. The requirement for processing operators sequentially and representing operators in human readable form resulted in storing each operator in two queues.

Operator overview

Changing pdf operators

Changing the properties of a visible object means wrapping the object into other objects. An object can depend on previous objects, so we need to be able to iterate backwards. Combining these requirements with a human readable representation of operators leads to these decisions:

The first requirement is needed when simulating the display process of a pdf viewer (e.g. when finding out the absolute position of an operator). The Composite design pattern used in the representation of pdf operators is very useful when changing objects (e.g. changing an existing text object to a composite is very easy, which allows us to add some special formatting). It becomes even more useful with the Decorator design pattern, which allows us to change an object in conjunction with previous changes.

Example of changed pdf operator

Pdfoperator iterator

Iterator design pattern allows us to iterate over items. When designed correctly specific iterator can be added easily.


The pdf operator iterator is a bidirectional iterator over PdfOperator objects. This iterator is special: the information about the next and previous object is stored in the object itself, much like in an ordinary doubly linked list. The iterator works with smart pointers, which brings the problem of dependency cycles: objects referring to each other would never get deallocated. This was solved by introducing another type of smart pointer which can handle this. The iterator class holds the information whether it is before the beginning, after the end or at a valid item.
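
The ownership problem can be shown on a tiny doubly linked list. The sketch below assumes the usual remedy, an owning pointer in one direction and a non-owning (weak) pointer in the other; it uses std::shared_ptr and std::weak_ptr, the names OperatorNodeSketch and OperatorCursorSketch are hypothetical, and, unlike the real iterator, the cursor does not distinguish "before the beginning" from "after the end".

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct OperatorNodeSketch {
    explicit OperatorNodeSketch(std::string n) : name(std::move(n)) {}
    std::string name;
    std::shared_ptr<OperatorNodeSketch> next;  // owning link forward
    std::weak_ptr<OperatorNodeSketch> prev;    // non-owning link backward
};

// Links b directly after a without creating an ownership cycle.
inline void linkAfter(const std::shared_ptr<OperatorNodeSketch>& a,
                      const std::shared_ptr<OperatorNodeSketch>& b) {
    a->next = b;
    b->prev = a;
}

// Minimal bidirectional cursor; a null position means "outside the list".
class OperatorCursorSketch {
public:
    explicit OperatorCursorSketch(std::shared_ptr<OperatorNodeSketch> p)
        : pos_(std::move(p)) {}
    bool valid() const { return static_cast<bool>(pos_); }
    const std::string& name() const { return pos_->name; }
    void forward() { if (pos_) pos_ = pos_->next; }
    void backward() { if (pos_) pos_ = pos_->prev.lock(); }

private:
    std::shared_ptr<OperatorNodeSketch> pos_;
};
```

If both links were owning pointers, two neighbouring operators would keep each other alive forever; with the weak backward link the chain is released as soon as its first element is released.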

Extending iterator

The basic iterator class implements one Template method. Specific iterators are created by overriding this method, which decides whether an item is valid or not. There are two important children of the base iterator class.

  • AcceptingPdfOperatorIterator class

  • RejectingPdfOperatorIterator class

These two classes either accept only a given set of operators or accept everything except that set. Extending these classes requires only defining the set of operators.
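
The accepting and rejecting variants can be sketched as follows. To keep the example short it walks a plain vector of operator names instead of the linked operator queue, and all class names (OperatorWalkerSketch and its children) are hypothetical; only the idea of one overridable validity test is taken from the design above.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Base class: the traversal is fixed, the validity test is the hook
// (Template method pattern).
class OperatorWalkerSketch {
public:
    virtual ~OperatorWalkerSketch() = default;
    // Returns the operators for which isValidItem() holds, in order.
    std::vector<std::string> collect(const std::vector<std::string>& ops) const {
        std::vector<std::string> out;
        for (const auto& op : ops)
            if (isValidItem(op)) out.push_back(op);
        return out;
    }

protected:
    virtual bool isValidItem(const std::string&) const { return true; }
};

// Accepts only operators from the given set.
class AcceptingWalkerSketch : public OperatorWalkerSketch {
public:
    explicit AcceptingWalkerSketch(std::set<std::string> names)
        : names_(std::move(names)) {}

protected:
    bool isValidItem(const std::string& op) const override {
        return names_.count(op) != 0;
    }

private:
    std::set<std::string> names_;
};

// Accepts everything except operators from the given set.
class RejectingWalkerSketch : public OperatorWalkerSketch {
public:
    explicit RejectingWalkerSketch(std::set<std::string> names)
        : names_(std::move(names)) {}

protected:
    bool isValidItem(const std::string& op) const override {
        return names_.count(op) == 0;
    }

private:
    std::set<std::string> names_;
};

// Extending: a walker over text showing operators only fixes the set.
struct TextWalkerSketch : AcceptingWalkerSketch {
    TextWalkerSketch()
        : AcceptingWalkerSketch(std::set<std::string>{"Tj", "TJ", "'", "\""}) {}
};
```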

Pdf operator iterators overview

[7] E.g. pdf describes a page as a dictionary which contains all necessary information for the page (page attributes), its position in the page tree (a reference to the parent tree node) and its content stored in content streams. CPage then uses this dictionary for its initialization and provides logic for this dictionary.

Chapter 6. Kernel internal objects


IPdfWriter abstract class provides interface for pdf content writing.

IPdfWriter writes pdf content in two phases:

  • Real data writing and information collecting. This is done by the writeContent method. The implementer is responsible for collecting all data needed to generate the cross reference section.
  • Cross reference section writing. This is done by the writeTrailer method, which takes all collected data and writes the cross reference table (or stream, depending on the implementation), the trailer dictionary and finally the information about the last xref position.

A different implementation of the IPdfWriter interface can be set with the XRefWriter::setPdfWriter method.

Each sequence of writeContent, [writeContent, ]* writeTrailer forms a new revision of the document in the incremental update sense. At the moment the OldStylePdfWriter implementation is used, which produces an old style cross reference table (see pdf specification, chapter 3.4.3 Cross reference table).
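
The two phases can be illustrated with a strongly simplified writer. The interface below is hypothetical (the real IPdfWriter methods take different parameters), object bodies are plain strings, every object gets its own one-entry subsection, and the trailer omits entries such as Root and Prev which a real incremental update needs.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Simplified two-phase writer interface.
class PdfWriterSketch {
public:
    virtual ~PdfWriterSketch() = default;
    virtual void writeContent(int objNum, const std::string& body,
                              std::ostream& out) = 0;
    virtual void writeTrailer(std::ostream& out) = 0;
};

// Writes an old style (table) cross reference section.
class TableWriterSketch : public PdfWriterSketch {
public:
    // Phase 1: write the object and remember where it starts.
    void writeContent(int objNum, const std::string& body,
                      std::ostream& out) override {
        offsets_[objNum] = static_cast<std::size_t>(out.tellp());
        out << objNum << " 0 obj\n" << body << "\nendobj\n";
    }

    // Phase 2: write the xref table, the trailer and the xref position.
    void writeTrailer(std::ostream& out) override {
        lastXref_ = static_cast<std::size_t>(out.tellp());
        const char oldFill = out.fill('0');
        out << "xref\n";
        int maxNum = 0;
        for (const auto& entry : offsets_) {
            // One-object subsection: "<first object number> <count>".
            out << entry.first << " 1\n"
                << std::setw(10) << entry.second << " 00000 n \n";
            if (entry.first > maxNum) maxNum = entry.first;
        }
        out.fill(oldFill);
        out << "trailer\n<< /Size " << (maxNum + 1) << " >>\n"
            << "startxref\n" << lastXref_ << "\n%%EOF\n";
        offsets_.clear();  // the next writeContent call starts a new revision
    }

    std::size_t lastXrefOffset() const { return lastXref_; }

private:
    std::map<int, std::size_t> offsets_;
    std::size_t lastXref_ = 0;
};
```

Because the collected offsets are cleared in writeTrailer, each writeContent ... writeTrailer sequence produces an independent section, which matches the notion of one revision per sequence.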


The XRefWriter class inherits from CXref and

  • provides a public interface for making changes and storing them to the stream.
  • maintains document revisions and the current revision (including the logic of what can be done in which revision and how).
  • keeps its mode:
    • paranoid mode forces a paranoiCheck method call in the changeObject method. This checks whether the given reference is known and compares the given object with the current one using the CXref::typeSafe method. If the check does not fail, the object can be changed and the rest of the work is delegated to the CXRef::changeObject implementation.
    • easy mode doesn't do any kind of checking. This is not very safe, but it is useful if the class user knows what he is doing (e.g. to correct destroyed objects).

XRefWriter's responsibility is to write changes to the XRef's stream so that they become visible after all required changes are finished. To stay independent of a concrete storing implementation, XRefWriter uses the IPdfWriter abstract interface for object writing. The concrete implementation can be set at runtime and OldStylePdfWriter is used by default. XRefWriter is only responsible for providing all changed objects, which are retrieved from the lower (CXref) layer (the mapping is protected).

The underlying stream with data, which is stored in the XRef supertype, is typed as a Stream (see the section called “Stream.h and Stream.cc”). IPdfWriter requires a StreamWriter (see the section called “Changes needed for code reuse”). So XRefWriter takes a StreamWriter implementation in its constructor; XRef uses this stream when fetching objects and IPdfWriter uses it when writing content.

Finally, XRefWriter keeps the information needed for revision handling. [8] XRefWriter keeps an array of all available revisions. The index stands for the revision number (0 is the newest one) and the value is the stream offset at which the cross reference section of that revision starts. This is used to get information about a revision, for example its size (see the XRefWriter::getRevisionSize method), and to enable revision changing. CXref implements

	void CXref::reopen(size_t xrefOff);

method which is responsible for a clean XRef state change: it forgets all objects from the current revision and starts parsing the cross reference section at the given position. This makes it (rather simply) possible to change the current revision and so to see the document in the state of that revision.

The PDF format is very well prepared for such revisions and multi version documents, but it doesn't support any kind of branching, which means that changes can be made only to the newest revision. XRefWriter keeps this in mind, so all methods take the current revision into account. Whenever the current revision is not the newest one, changed objects are ignored and everything is delegated directly to the lowest layer. Also, all methods producing changes throw ReadOnlyDocumentException. Moreover, all methods reimplemented by the CXRef class which depend on changes are skipped and the XRef implementation is used directly instead.
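
The revision bookkeeping and the read only rule for older revisions can be sketched like this. RevisionSketch and ReadOnlySketch are hypothetical names; the sketch only counts changes instead of storing objects, and where the real code would call CXref::reopen it merely remembers the selected revision.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Stand-in for ReadOnlyDocumentException.
struct ReadOnlySketch : std::runtime_error {
    ReadOnlySketch()
        : std::runtime_error("changes are allowed only in the newest revision") {}
};

class RevisionSketch {
public:
    // xrefOffsets[i] is the xref section offset of revision i; 0 is the newest.
    explicit RevisionSketch(std::vector<std::size_t> xrefOffsets)
        : offsets_(std::move(xrefOffsets)) {}

    void changeRevision(std::size_t revision) {
        if (revision >= offsets_.size()) throw std::out_of_range("no such revision");
        current_ = revision;  // the real code would reopen at offsets_[revision]
    }

    std::size_t currentXrefOffset() const { return offsets_.at(current_); }
    bool isNewest() const { return current_ == 0; }

    // Every method producing a change is guarded the same way.
    void changeObject() {
        if (!isNewest()) throw ReadOnlySketch();
        ++changes_;
    }

    std::size_t changeCount() const { return changes_; }

private:
    std::vector<std::size_t> offsets_;
    std::size_t current_ = 0;
    std::size_t changes_ = 0;
};
```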

Stream encoding

Pdf streams can be encoded using one or more filters specified by pdf specification. The encoding/decoding algorithm is called a filter. There are two types of filters

  • ASCII filters - enable decoding of arbitrary 8-bit binary data that has been encoded as ASCII text

  • (de)compression filters - enable decoding of data that has been compressed

There is a set of filters which every pdf viewer must implement. It is up to the viewer how it handles an unknown filter.

Xpdf filters

Xpdf viewer implements all decoding filters from pdf specification v1.5. However it does not implement any encoding filters (it does not need them).

Simple xpdf stream hierarchy

The xpdf stream design is good but can be improved. The design resulted in many objects that are tightly coupled together (see picture). They could be decoupled using the Facade design pattern, which would also make the objects more reusable. Finally, the unclear implementation makes it difficult to use and very difficult to extend, and it lacks almost any fault tolerance. We use xpdf filters only to decode data (because they are hardwired into xpdf parsing), but we design our own encoding filters.

Encoding filter design

The idea behind filtering streams is quite simple and common, so we use boost filtering iostreams, which give us a very nice filtering input/output concept. A filtering stream consists of filters and devices. A device can be either a source (a device you can read from) or a sink (a device you can write to). We only need to encode buffers, so we use only sources. We can connect an arbitrary number of filters to a source. When we read from a filtering input stream, the read call is delegated through the chain of filters to the device, which returns raw character(s). Then every filter, in reverse order, encodes the character(s) and passes them to the next filter. Finally, the encoded character(s) come out of the filtering input stream. Which filters to use is specified in the stream dictionary.

Boost stream concept in pdfedit

This design is flexible and easily extensible: we only have to implement the new filter and connect it to the source when its name matches the filter name specified in the stream dictionary.
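
The name based selection of encoders can be sketched without boost. The example deliberately uses only the standard library and encodes whole buffers, so it does not reproduce the incremental behaviour of boost filtering streams; EncodeFilterSketch, AsciiHexEncoderSketch, makeEncoder and encodeStream are hypothetical names. It assumes that the filter names are listed in decoding order, so encoding applies them in reverse.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// A filter encodes a whole buffer in this sketch.
struct EncodeFilterSketch {
    virtual ~EncodeFilterSketch() = default;
    virtual std::string encode(const std::string& raw) const = 0;
};

// Encoder matching ASCIIHexDecode: two hex digits per byte, '>' as end marker.
struct AsciiHexEncoderSketch : EncodeFilterSketch {
    std::string encode(const std::string& raw) const override {
        static const char digits[] = "0123456789ABCDEF";
        std::string out;
        for (unsigned char c : raw) {
            out.push_back(digits[c >> 4]);
            out.push_back(digits[c & 0x0F]);
        }
        out.push_back('>');
        return out;
    }
};

// Hypothetical factory keyed by the filter name from the stream dictionary.
inline std::unique_ptr<EncodeFilterSketch> makeEncoder(const std::string& name) {
    if (name == "ASCIIHexDecode")
        return std::unique_ptr<EncodeFilterSketch>(new AsciiHexEncoderSketch);
    return nullptr;  // unknown filter
}

// Encodes data so that decoding with filterNames, in the listed order,
// recovers the raw buffer. Returns false when a filter is not supported.
inline bool encodeStream(const std::vector<std::string>& filterNames,
                         std::string& data) {
    for (auto it = filterNames.rbegin(); it != filterNames.rend(); ++it) {
        std::unique_ptr<EncodeFilterSketch> encoder = makeEncoder(*it);
        if (!encoder) return false;
        data = encoder->encode(data);
    }
    return true;
}
```

Supporting another filter then means adding one encoder class and one branch in the factory; the chaining code stays untouched.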

Observers in cobjects

The generic observer mechanism described in the section called “Observers” is used in our cobject implementation. The IProperty base class implements the IObserverHandler interface, so all its descendants are responsible for notifying about changes (see the section called “Low level CObjects”).

All cobjects provide a BasicChangeContext which contains the previous value of the changed one. Because complex types can add and remove their elements, special behaviour is specified. The new value, given as a mandatory parameter of the notify method, may be CNull if an element is deleted from a complex type. The old value in the context may also be CNull if an element was added. It is impossible for both of them to be CNull.
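
The old value and new value convention can be illustrated on a tiny dictionary. In this sketch a null pointer plays the role of CNull, observers are plain std::function callbacks instead of IObserver implementations, and the names ValueSketch, ChangeSketch and DictSketch are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// A null pointer plays the role of CNull in this sketch.
using ValueSketch = std::shared_ptr<const std::string>;

struct ChangeSketch {
    ValueSketch oldValue;  // null when an element was added
    ValueSketch newValue;  // null when an element was deleted
};

class DictSketch {
public:
    using Observer = std::function<void(const ChangeSketch&)>;

    void registerObserver(Observer observer) {
        observers_.push_back(std::move(observer));
    }

    void set(const std::string& key, const std::string& value) {
        const ValueSketch old = lookup(key);
        const ValueSketch fresh = std::make_shared<std::string>(value);
        entries_[key] = fresh;
        notify(ChangeSketch{old, fresh});
    }

    void remove(const std::string& key) {
        const ValueSketch old = lookup(key);
        if (!old) return;  // nothing deleted, so nothing to report
        entries_.erase(key);
        notify(ChangeSketch{old, nullptr});
    }

private:
    ValueSketch lookup(const std::string& key) const {
        auto it = entries_.find(key);
        if (it == entries_.end()) return nullptr;
        return it->second;
    }

    void notify(const ChangeSketch& change) const {
        for (const auto& observer : observers_) observer(change);
    }

    std::map<std::string, ValueSketch> entries_;
    std::vector<Observer> observers_;
};
```

Because remove reports nothing for a missing key and set always supplies a value, an observer never receives a change in which both values are null.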

Observers in CPdf

The CPdf class, as the PDF file maintainer, uses observers to synchronize structures which may be changed in two ways. This may occur because all attributes are accessible through properties (cobjects) and also through special objects provided by CPdf: CPage for page manipulation, COutline for outline manipulation and so on. Special objects keep the logic of concrete entities and manipulate cobjects accordingly. The property part is without any logic and enables changes which are not covered by special objects. This advantage and extensibility is paid for by additional synchronization.


The property tree does not know how it was changed.

Page tree synchronization

The page tree, as mentioned above, can be changed from the property side by making changes directly to the pdf page tree, or by CPdf using


methods. The second way is safer, because all necessary work is done correctly there, which is not guaranteed on the property side, where any kind of data can be supplied.

Synchronization is guaranteed by 3 internal [9] classes which implement the IObserver interface. To keep their logic simple, each observer is specialized for one type of change. So we have

  • PageTreeRootObserver - which is registered on the Pages entry in the pdf document catalog. If this value changes, we have to throw away and invalidate all pages which have been returned so far.
  • PageTreeNodeObserver - which is registered on all intermediate nodes. If a node's dictionary is changed, it checks whether the changed entry may affect the page tree (Kids, Count or Parent properties are added, removed, or their value is changed). If so, it has to invalidate the whole subtree of this node.
  • PageTreeKidsObserver - which is registered on all elements of an intermediate node's Kids array. If any of these elements changes its value (they all should be references to a Pages or Page dictionary), the original node has to be invalidated, and also its children if it is an intermediate node.

Whenever a Kids array changes and, as a result, the subtree of an intermediate node is changed, the Count entry in the node's dictionary has to be changed too and the change propagated up to the page tree root.
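
The Count propagation can be sketched on a simplified tree. PageNodeSketch, attach and consolidateCount are hypothetical names and the nodes are plain structs linked by raw pointers; the real consolidation works on pdf dictionaries and handles many more cases.

```cpp
#include <cassert>
#include <vector>

struct PageNodeSketch {
    // A leaf (page) counts as one page, an intermediate node starts empty.
    explicit PageNodeSketch(bool leaf) : count(leaf ? 1 : 0) {}
    PageNodeSketch* parent = nullptr;
    std::vector<PageNodeSketch*> kids;
    int count;
};

// Appends kid to the Kids list of parent; Count is deliberately left stale.
inline void attach(PageNodeSketch& parent, PageNodeSketch& kid) {
    kid.parent = &parent;
    parent.kids.push_back(&kid);
}

// Recomputes Count of the node whose Kids changed and applies the
// difference to every ancestor up to the page tree root.
inline void consolidateCount(PageNodeSketch& node) {
    int sum = 0;
    for (const PageNodeSketch* kid : node.kids) sum += kid->count;
    const int delta = sum - node.count;
    for (PageNodeSketch* n = &node; n != nullptr; n = n->parent)
        n->count += delta;
}
```

Propagating only the difference means that the ancestors do not have to rescan their own Kids arrays.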

All of this is implemented by the consolidatePageTree and consolidatePageList methods in CPdf. For more implementation details, please see the doxygen documentation.

Observers in CPage

Observer for annotation

CPage stores all CAnnotation instances from its Annots array in the internal annotStorage. The user can change annotations with the addAnnotation or delAnnotation methods, but also by changing the Annots array directly.

To prevent inconsistency, the CPage instance registers the AnnotsArrayWatchDog observer implementation on the Annots array and all its members (because they are references and someone may change a reference value to a target which is not an annotation dictionary). Whenever this array changes, the observer forces annotStorage to be reloaded (see the doxygen documentation for more precise information, because the details are out of scope of this documentation).


Mode controller is a class which provides property mode configuration. Each property keeps its mode. This is just information (a kind of tag) which the property user can use to determine how the property should be presented to the application user. It does not restrict in any way how the property can be used.

The reason for such a mode is that pdf has many objects which are suitable for changes, while others are just technical or contain meta data. Changes made to meta data or technical objects may corrupt the whole document with very little effort (e.g. changing the reference to the page tree root leads to the loss of all pages). We didn't want the logic of what is possible and what is not to be hardcoded in the kernel. So we have defined a mode for each property, and a GUI which displays objects in a property editor may look at the property mode and decide whether it provides read only, visible or warning throwing access (the GUI can also provide configuration of how each mode should be treated).

The ModeController class is responsible for providing a mode for a property according to the property name (id) and the type of its parent. All pdf properties are part of a certain context (parent data). This context may be a dictionary or an array, or the object can be indirect, in which case it is its own context. ModeController is based on RulesManager (see the section called “Rules manager”). As the rule data type, we use

	struct ModeRule {
		std::string type;
		std::string name;
		bool operator==(const ModeRule & rule) const {
			return type==rule.type && name==rule.name;
		}
	};

structure. Type stands for the context type (the value of the Type field of the parent data type, or of the object itself if it is indirect and so has no parent) and name stands for the name of the property in the context.

The property matcher implemented for ModeController defines 4 priorities of matching with the following logic:

  • original={"", ""} - rule always matches with PRIO0 priority.
  • original={type, ""} - rule matches if rule.type==original.type with PRIO1 priority.
  • original={"", name} - rule matches if rule.name==original.name with PRIO2 priority.
  • original={type, name} - rule matches if original==rule with PRIO3 priority.
  • PRIO0 < PRIO1 < PRIO2 < PRIO3

This makes it possible to define rules with a more or less general meaning as well as an explicit setting for one property.
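
The matching logic above can be sketched as follows. The function names match and findMode, the table type and the mode names used as plain strings are hypothetical; the real matcher is part of the RulesManager machinery. The sketch assumes that the rule with the highest priority wins and that a fallback mode is returned when nothing matches.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

enum PrioritySketch { NO_MATCH = -1, PRIO0 = 0, PRIO1, PRIO2, PRIO3 };

struct ModeRuleSketch {
    std::string type;
    std::string name;
};

// Priority with which the stored rule `original` matches the queried `rule`.
inline PrioritySketch match(const ModeRuleSketch& original,
                            const ModeRuleSketch& rule) {
    const bool anyType = original.type.empty();
    const bool anyName = original.name.empty();
    if (anyType && anyName) return PRIO0;
    if (anyName) return original.type == rule.type ? PRIO1 : NO_MATCH;
    if (anyType) return original.name == rule.name ? PRIO2 : NO_MATCH;
    return (original.type == rule.type && original.name == rule.name)
               ? PRIO3 : NO_MATCH;
}

using RuleTableSketch = std::vector<std::pair<ModeRuleSketch, std::string>>;

// Returns the mode of the best matching rule, or fallback if none matches.
inline std::string findMode(const RuleTableSketch& rules,
                            const ModeRuleSketch& query,
                            const std::string& fallback) {
    int best = NO_MATCH;
    std::string mode = fallback;
    for (const auto& entry : rules) {
        const int prio = match(entry.first, query);
        if (prio > best) {
            best = prio;
            mode = entry.second;
        }
    }
    return mode;
}
```

With a catch-all rule, a rule for the Page context, a rule for the Type property and an exact rule for Page and Parent, a query for the Parent property of a page selects the exact rule, while an unrelated property falls back to the catch-all rule.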

The inherited RulesManager enables configuration loading from a stream by default. We have additionally implemented ModeConfigurationParser, which is able to parse strings from a configuration file. To be less dependent on a particular configuration file format, we have built it upon StringConfigurationParser, which simply reads key value pairs from the file, and ModeConfigurationParser transforms these values into its internal data types for rule and target (ModeRule and Mode types). This is an example of an Adapter design pattern implementation.

We have also written an example configuration file (this file is stored in the data directory, by default /usr/share/pdfedit, but this depends on the prefix parameter during installation) with quite reasonable restrictions. For more information about the file format and mode types, please see the project doxygen documentation for ModeController.

Mode controller and related classes.


Sometimes it is desirable to display the progress of the current operation to the user. To accomplish this task, we have implemented the ProgressObserver class, which implements IObserver (see the section called “Observers”) and internally stores an implementation of the IProgressBar interface.

The responsibility of this class is to implement the notify method so that the current state of progress, given as a parameter, is displayed by the IProgressBar implementation. It has to recognize when the progress starts (this is provided by an internal flag) and when it finishes (when the scope's total count is reached).

Instance of this class can be registered on arbitrary descendant of ObserverHandler class which registers PdfWriterObserver [10] compatible observers and uses ScopeChangeContext with OperationStep and OperationScope template parameters.

Progress observer and related classes.

Graphical state

A content stream is a sequence of operations which alter the graphical state. We need to obtain some information after each of these operations, and xpdf code is not suitable for this sort of thing. If appropriate functions are defined and the object is changed a bit, it could be used to display the content stream. It can be easily extended by adding appropriate operations, whereas Xpdf code is very difficult to extend here because it is tailored just for displaying purposes. Again, it has almost zero fault tolerance, which was improved by adding some constraints into xpdf code. Functions altering the graphical state were heavily inspired by xpdf code; most of the code is just copied from xpdf.

Obtaining information after every operator

Each pdf operator has a function which alters the graphical state appropriately. The mechanism used by stateupdater is simple but very flexible. It gets a simple iterator which it uses to iterate through all operators. After the function assigned to an operator is performed, a functor is called with all needed information available.

StateUpdater class

Setting operator bounding boxes

The kernel supplies a special functor to the stateupdater, which gets called after every operation with information about the rectangle the operator used. The functor sets this rectangle as the bounding box of the operator.
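
The functor mechanism can be sketched in a few lines. The sketch models only two path operators (m and l), reduces the graphical state to the current point and takes the rectangle between the previous and the new point as the rectangle used by the operator; updateState, BBoxSetterSketch and the struct names are hypothetical and much simpler than the real StateUpdater.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct RectSketch { double x1, y1, x2, y2; };

struct OpSketch {
    std::string name;  // operator name, e.g. "m" or "l"
    double x, y;       // operands
    RectSketch bbox;   // filled in by the functor
};

// Graphical state reduced to the current point.
struct StateSketch {
    double x = 0.0;
    double y = 0.0;
};

// Runs every operator against the state and calls afterEach with the
// operator and the rectangle between the old and the new current point.
template <typename Functor>
void updateState(std::vector<OpSketch>& ops, Functor afterEach) {
    StateSketch state;
    for (OpSketch& op : ops) {
        const StateSketch before = state;
        if (op.name == "m" || op.name == "l") {  // both move the current point
            state.x = op.x;
            state.y = op.y;
        }
        afterEach(op, RectSketch{before.x, before.y, state.x, state.y});
    }
}

// Functor in the spirit of the one supplied by the kernel.
struct BBoxSetterSketch {
    void operator()(OpSketch& op, const RectSketch& used) const {
        op.bbox = used;
    }
};
```

Any other per-operator processing, such as collecting statistics, only requires passing a different functor; the traversal itself does not change.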

[8] Each cross reference section with its associated trailer is considered to form a revision. This means that each document has at least one revision and every incremental update forms a new revision. The newest revision is the one at the end of the file.

[9] Classes are internal because they need access to CPdf internals, such as pageList and protected methods for page tree and page list consolidation.

[10] This is just typedef to IObserver implementation with OperationStep template parameter.

Chapter 7. Changes to pdf document

As mentioned before, a pdf document is represented by a set of objects. Most of these objects (with the exception of the document trailer) are so called indirect objects. They are accessible through the cross reference table, which maps object and generation numbers to the file offset where the object is stored.

The pdf format supports document changes natively through the so called incremental update mechanism. A file can have several cross reference tables, each describing the objects specific to its revision and overriding old values. All objects which have to be changed are just written to the end of the file, together with a new cross reference table which describes these new objects. All pdf viewers should be aware of incremental updates: if an object is accessible from several cross reference tables, the newest one is always used.
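
The rule that the newest cross reference table wins can be written down in a few lines. This is a simplified sketch: XrefSectionSketch and lookupOffset are hypothetical names, and generation numbers and free entries are ignored.

```cpp
#include <cassert>
#include <map>
#include <vector>

// One cross reference section: object number -> file offset.
using XrefSectionSketch = std::map<int, long>;

// sections[0] belongs to the oldest revision, sections.back() to the newest.
// Returns true and sets offset if some section describes the object.
inline bool lookupOffset(const std::vector<XrefSectionSketch>& sections,
                         int objNum, long& offset) {
    for (auto it = sections.rbegin(); it != sections.rend(); ++it) {
        auto entry = it->find(objNum);
        if (entry != it->end()) {
            offset = entry->second;  // the newest description wins
            return true;
        }
    }
    return false;
}
```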

This very short description shows that making changes requires taking over cross reference table manipulation. This has to be done transparently, so that nobody needs to know that an object has been changed and everybody always gets the most recent objects. We also want to control who can make changes and who is just a consumer of objects and so is not supposed to make changes. This is handled by the 3 layer model described in the following section.

3 layer model

The lowest layer

The cross reference table is maintained by the XRef class in xpdf code. This class is responsible for parsing cross reference sections according to the pdf specification, for keeping a table of this information and for fetching indirect objects. [11] The XRef class is not designed to be easily extended for making changes. So we have reused this class as the lowest layer in our 3 layer model designed to enable making changes to the document. This XRef layer keeps the logic of pdf file parsing and correct assignment of objects according to references. So the basic idea and responsibility is kept.

To enable this reuse in C++ we had to make some minor changes to xpdf code, basically to prepare it for transparent dynamic inheritance: all necessary methods were made virtual and data fields were changed from private to protected (see also the section called “Changes needed for code reuse”).

The middle layer

The second (middle) layer of the model is formed by CXref (see the section called “CXRef”), a descendant of the XRef class. Its prime responsibility is to provide methods which register changes and to keep all changed objects in its internal state. It overrides public methods of XRef and always uses changed objects if they are available; otherwise it delegates to the lower layer (the XRef implementation). This approach makes it possible to use CXref transparently anywhere an XRef instance is required (e.g. in the rest of xpdf code which may be reusable), with the advantage of access to the most recent values without any special logic on the side of the class user. To prevent inconsistencies and to make usage and implementation easier, all methods which enable making changes are protected rather than public: they are hidden from normal usage but can be reused by descendants. They are implemented without any special logic. All changes are stored in a mapping where they can be accessed, and no special checking is performed. It is safe to return a CXref instance, because it is guaranteed that nobody can use this class to make changes. Pdfedit code uses CXref in order not to depend on the Xpdf XRef class.

CXref is also responsible for handling new objects. This means that it provides methods to reserve a new reference and to add new objects. All new references are stored in the newStorage container, where each new reference is mapped to its current state. When a new reference is reserved by the reserveRef method, it is marked as RESERVED_REF, and when changeObject is called with this reference for the first time, its state is changed to INITIALIZED_REF. This state separation enables correct object counting, because only references in the INITIALIZED_REF state are counted, and it also ensures that a RESERVED_REF reference is not returned twice. This functionality is also protected, so it is invisible to instance users, and it is used by the 3rd layer.
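
The two states of a new reference can be sketched as follows. NewStorageSketch, markChanged, stateOf and the UNKNOWN_REF value are hypothetical; references are reduced to object numbers and the container only tracks their state.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// UNKNOWN_REF is a hypothetical value for references not created here.
enum RefStateSketch { UNKNOWN_REF, RESERVED_REF, INITIALIZED_REF };

class NewStorageSketch {
public:
    explicit NewStorageSketch(int firstFreeNum) : nextNum_(firstFreeNum) {}

    // Hands out a fresh object number; the same number is never returned twice.
    int reserveRef() {
        const int num = nextNum_++;
        states_[num] = RESERVED_REF;
        return num;
    }

    // Called when changeObject is used on a reserved reference.
    void markChanged(int num) {
        auto it = states_.find(num);
        if (it != states_.end()) it->second = INITIALIZED_REF;
    }

    RefStateSketch stateOf(int num) const {
        auto it = states_.find(num);
        return it == states_.end() ? UNKNOWN_REF : it->second;
    }

    // Only initialized references count as new objects.
    std::size_t countInitialized() const {
        std::size_t n = 0;
        for (const auto& entry : states_)
            if (entry.second == INITIALIZED_REF) ++n;
        return n;
    }

private:
    int nextNum_;
    std::map<int, RefStateSketch> states_;
};
```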

The class also implements a simple type checking method

	virtual bool typeSafe(Object * obj1, Object * obj2);

which is public and performs the following tests to guarantee that obj2 can replace obj1 and that the result would be syntactically correct:

  • obj2 has to have the same type as obj1 if neither of them is a reference
  • if at least one of them is a reference, the fetched objects have to have the same type
  • obj2 may have a different type only if obj1 is the (pdf) null object.

Note that CXref doesn't use this method internally (as mentioned before, it doesn't do any checking on values at all), but exports it, so the instance user can do the checking himself (XRefWriter in the 3rd layer uses this method in paranoid mode).
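
The three rules can be expressed in a small sketch. It does not use the xpdf Object class: ObjSketch only carries a type tag and, for references, an object number, and TableSketch maps object numbers to the type of the stored object. All names are hypothetical, and treating a reference to a missing object as null is an assumption of the sketch.

```cpp
#include <cassert>
#include <map>

enum TypeSketch { T_NULL, T_INT, T_STRING, T_DICT, T_REF };

struct ObjSketch {
    TypeSketch type;
    int refNum;  // meaningful only when type == T_REF
};

// Object number -> type of the stored indirect object.
using TableSketch = std::map<int, TypeSketch>;

// Type of the object itself or, for a reference, of the fetched object.
inline TypeSketch resolvedType(const ObjSketch& obj, const TableSketch& table) {
    if (obj.type != T_REF) return obj.type;
    auto it = table.find(obj.refNum);
    return it == table.end() ? T_NULL : it->second;  // assumption: missing = null
}

// True if newObj may replace oldObj without changing the type.
inline bool typeSafeSketch(const ObjSketch& oldObj, const ObjSketch& newObj,
                           const TableSketch& table) {
    if (oldObj.type == T_NULL) return true;  // null may become anything
    if (oldObj.type != T_REF && newObj.type != T_REF)
        return oldObj.type == newObj.type;   // plain values: same type
    return resolvedType(oldObj, table) == resolvedType(newObj, table);
}
```

In paranoid mode a check of this kind would run before every change, while easy mode would skip it.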

The highest layer

The highest layer is represented by the XRefWriter class (see the section called “XRefWriter”), an extension of the CXRef class (see the section called “CXRef”). Its responsibility is to keep the logic of changes, to enable writing them to the file and to maintain the revisions of the document. Logic of changes means some type checking to prevent object type inconsistency.

For more information about responsibility and functionality separation see following figure.

3 layers diagram

Document saving

As mentioned above, the PDF format supports document changes through so called incremental updates (all changed objects are appended to the end of the document together with a new cross reference section for the changed objects). This means that each set of changes forms a new revision. This raises a small question: what should be stored in one revision and which changes are not worth a new revision? The user usually wants to save everything for fear of data loss and doesn't think about revisions. If each save created a new revision, it would lead to a mess with a horrible number of revisions without any meaning.

Revision saving

XRefWriter provides save functionality with a flag. This flag says how data should be stored with respect to revisions:

  • temporal saving, which dumps all changes with a correct cross reference table and trailer at the end of the document but otherwise ignores the save (no internal structures are touched and they are kept as if no save had been done). If any problem occurs, the changed data are already stored, so no data is lost. Whenever a save is done again, it overwrites the older temporarily saved changes.
  • revision saving, which does the very same as the previous one, except that all internal structures are brought to the state they would have if the document were opened again after saving. This means that after saving we are working on a freshly created revision. This makes sense when the user knows that his changes are gathered together in one revision and nothing else interferes with it. The implementation is straightforward, because we just need to force CXref to reopen (call the CXref::reopen method) and move storePos behind the stored data.

It is up to the user to choose how to save changes. However, temporal saving is the default and a new revision is saved only if it is explicitly requested.
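The difference between the two modes can be sketched with a toy model. This is only an illustration under stated assumptions: ToyDocument, dump, saveTemporal and saveRevision are hypothetical names and not the XRefWriter interface. Only the storePos idea is taken from the text above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model of a file extended by incremental updates (not the XRefWriter API).
struct ToyDocument {
    std::string file;                  // bytes already in the file
    std::size_t storePos;              // where the next dump of changes starts
    std::vector<std::string> pending;  // changed objects of the open revision
    int revisions;                     // revisions known to the model

    explicit ToyDocument(const std::string &original)
        : file(original), storePos(original.size()), revisions(1) {}

    // Writes pending changes at storePos, overwriting any earlier temporal dump.
    // Returns the position just behind the written data.
    std::size_t dump() {
        file.resize(storePos);
        for (const std::string &obj : pending) file += obj;
        file += "[xref+trailer]";      // placeholder for the real xref and trailer
        return file.size();
    }

    // Temporal save: data reach the file, internal state stays untouched.
    void saveTemporal() { dump(); }

    // Revision save: same dump, then behave as if the file had been reopened.
    void saveRevision() {
        storePos = dump();             // move storePos behind the stored data
        pending.clear();               // changes now belong to a closed revision
        ++revisions;
    }
};
```

In this model, repeated temporal saves keep replacing the same tail of the file, while a revision save freezes that tail and starts a new one.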

Content writing and IPdfWriter

XRefWriter uses the abstract IPdfWriter class to write changed content when the save method is called. This separates implementation from design. All saving is delegated to the pdfWriter implementation holder, and it is up to that implementation how the content is written (see the section called “IPdfwriter”). We have implemented the OldStylePdfWriter pdf writer, which writes objects according to the pdf specification and creates an old style pdf cross reference table (standard for the Pdf specification prior to version 1.5, see Cross reference table).
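The delegation described above might look roughly like the following sketch. ContentWriter, TableStyleWriter and saveChanges are hypothetical stand-ins chosen for this illustration. They do not reproduce the real IPdfWriter or OldStylePdfWriter signatures, and the output is a simplified table with one subsection per object.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<int, std::string> > ObjectList;  // (object number, body)

// Hypothetical abstract writer; the saving code knows only this interface.
class ContentWriter {
public:
    virtual ~ContentWriter() {}
    virtual void write(const ObjectList &objs, std::ostream &out) = 0;
};

// Writes the objects followed by a classic "xref" table (style used before PDF 1.5).
class TableStyleWriter : public ContentWriter {
public:
    void write(const ObjectList &objs, std::ostream &out) override {
        std::vector<std::pair<int, long> > offsets;
        for (const auto &o : objs) {
            // remember where the object starts before writing it
            offsets.push_back(std::make_pair(o.first, static_cast<long>(out.tellp())));
            out << o.first << " 0 obj\n" << o.second << "\nendobj\n";
        }
        out << "xref\n";
        for (const auto &e : offsets)
            out << e.first << " 1\n"
                << std::setw(10) << std::setfill('0') << e.second << " 00000 n \n";
    }
};

// The caller picks the implementation; the saving code stays the same.
std::string saveChanges(ContentWriter &writer, const ObjectList &changed) {
    std::ostringstream out;
    writer.write(changed, out);
    return out.str();
}
```

A writer producing cross reference streams could be added later as another subclass without touching saveChanges, which is the point of the separation.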

Document cloning

To effectively work around the inability of PDF to branch a document, and so to make changes to older revisions, XRefWriter provides a so-called cloning capability (this has nothing to do with the object cloning mentioned in other chapters). Cloning means copying the document content up to and including the current revision. If the user wants to change something in such a revision, he can switch to that revision and clone it to a different file. Changes are enabled in the created document, because the current revision of the original document is the newest one in the cloned document. Nevertheless, document merging is not implemented yet, so there is no way to get those changes back to the main document (by any pdfedit component).

Linearized pdf documents

All previously mentioned functionality depends on the incremental update mechanism. However, a pdf document may have a slightly different format. Such documents are called Linearized and are designed for environments where it may be a problem (e.g. a time problem) to wait until the whole document is read so that parsing from the end of the file can start (see Pdf specification Appendix F for more information).

Such documents have special requirements and they are not designed for making changes. The 3rd layer handles this situation rather strictly and XRefWriter checks whether the given file is linearized during initialization. Some operations, such as revision handling, are not implemented for linearized documents, and document saving may produce an incorrect document (pdf viewers which strictly rely on linearized information may display different output).

Because many documents (especially from the internet) are linearized, we have provided the Delinearizator class placed in utils. It is able to get rid of the linearized structures and create a new pdf document which has the same objects but a normal structure. Usage of the class is very simple, see the following example:

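        IPdfWriter * writer=new OldStylePdfWriter();
        Delinearizator *delinearizator=Delinearizator::getInstance(fileName.c_str(), writer);
        if(!delinearizator) // assumption: no instance is returned for a non linearized file
        {
            printf("\t%s is not suitable because it is not linearized.\n", fileName.c_str());
            return;
        }
        string outputFile=fileName+"-delinearizator.pdf";
        printf("\tDelinearized output is in %s file\n", outputFile.c_str());
        delinearizator->delinearize(outputFile.c_str()); // assumed call that writes the output file
        delete delinearizator;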

GUI design description

This part describes some basic principles of the graphical user interface. It describes the menu settings system, used to manage menu and toolbar items; the settings system, used to read, write, manage and watch settings and defaults; and the tree system, used to manage the object tree view on the right side of the editor window. Some of the principles (the settings system) also work in the command line interface, with one small difference: in command line mode there is no system to notify about a changed setting. It is not necessary there, as there is no user interaction in that mode.

Chapter 8. PDF Editor menu and toolbar system

Configuration of menus and toolbars is in the main configuration file pdfeditrc, which is placed in the data directory (by default it is configured as /usr/share/pdfedit). The format of the file is 'ini file', where each section has a heading composed of its name in square brackets, followed by lines in the format "key=value".

User settings are in the file .pdfedit/pdfeditrc in the home directory. The user can redefine any key in any section, and so can modify the editor menus as he deems appropriate.

Menus and Toolbars

Menus are stored in the gui section. Each menu, menu item, toolbar or toolbar item has a unique name (all these share a common namespace). Menu items can often be used as toolbar items (as long as they have an icon) and toolbar items can be used as menu items (as long as they have text).

Lists and items

Each item is either a "list", "label" or "item". These are menus and menu items in context of menu, or toolbars, labels and toolbar items in context of toolbar. Toolbar can also contain special items.

Usage is basically the same for menus and toolbars, but there are some differences:

  • Lists in a menu can contain items (menu items) or lists (submenus). Submenus can be nested to any level, but you cannot create a loop. The program will report the loop and will not start if you do so. A list used as a toolbar can contain only items and labels.
  • A menu item does not need an icon, a toolbar item needs an icon.
  • The caption in menu items will be used as the text of the menu item, the caption in toolbar items will be used as the tooltip.
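The loop check mentioned in the first point can be sketched as a depth-first walk over the list definitions. This is a minimal illustration, not the editor's actual code. MenuLists, hasLoopFrom and menuHasLoop are hypothetical names, and the configuration is assumed to be already parsed into a map from list name to referenced names.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps a list name to the names it references (items or other lists).
typedef std::map<std::string, std::vector<std::string> > MenuLists;

// Depth-first walk; 'path' holds the lists that are currently being expanded.
static bool hasLoopFrom(const MenuLists &lists, const std::string &name,
                        std::set<std::string> &path) {
    MenuLists::const_iterator it = lists.find(name);
    if (it == lists.end()) return false;         // plain item, not a list
    if (!path.insert(name).second) return true;  // list is already open: loop
    for (const std::string &child : it->second)
        if (hasLoopFrom(lists, child, path)) return true;
    path.erase(name);                            // done with this submenu
    return false;
}

// Returns true when some submenu reachable from 'root' contains one of its own ancestors.
bool menuHasLoop(const MenuLists &lists, const std::string &root) {
    std::set<std::string> path;
    return hasLoopFrom(lists, root, path);
}
```

Because a name is removed from the path once its subtree is finished, the same submenu may legally appear in two different menus without being reported as a loop.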

The list called MainMenu will be used as the editor's main menu. The key called toolbars contains a comma separated list of toolbar names. Only toolbars specified here will be available in the editor.

Format of one list. 

items/name_of_list=list Caption of menu,menu item 1, menu item 2, ....

The number of items in one list is not limited, but note that the dimensions of the screen where the menu or toolbars will be shown are limited. Each menu item referenced in the list must exist; an invalid item will cause the editor to exit at startup with an error message. If you want to insert a separator in the menu, use "-" (single dash) in place of the item's name.

Caption of menu is the text that will be shown as the menu text/tooltip. All further parameters (comma separated) following the caption are items that will be contained in the list.

Appending items to list. 

items_add/name_of_list=menu item 1, menu item 2, ....

This way you can add items to an existing list. Only one "items_add" field can exist for each list. It is used mostly for adding extra items to any of the menus in the user configuration file, without having to redefine the entire list and its items. This field is composed of comma separated menu items to be appended at the end of the menu; the syntax of items is the same as for items in an ordinary list.

Format of one item. 

items/name_of_item=item Caption of item,Command,Keyboard_shortcut,Icon


Elements in this definition are comma separated. If any element should include comma in its content, you must escape the comma by backslash, otherwise the comma would be treated as separator of elements. Also, backslash must be escaped by extra backslash.
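The escaping rule can be illustrated with a small splitter. This is a sketch of the rule as described above, not the parser used by the editor, and splitElements is a hypothetical name.

```cpp
#include <string>
#include <vector>

// Splits 'line' on unescaped commas; "\," yields a comma and "\\" a backslash.
std::vector<std::string> splitElements(const std::string &line) {
    std::vector<std::string> out(1);              // start with one empty element
    for (std::string::size_type i = 0; i < line.size(); ++i) {
        char c = line[i];
        if (c == '\\' && i + 1 < line.size()) {
            out.back() += line[++i];              // take the escaped character literally
        } else if (c == ',') {
            out.push_back(std::string());         // separator: open the next element
        } else {
            out.back() += c;
        }
    }
    return out;
}
```

With this rule the text `a\,b,c\\,d` splits into three elements: `a,b`, `c\` and `d`.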

Icon is the name of the file with the icon image. The icon may be in PNG, BMP, XBM, XPM or PNM format, although PNG is recommended. JPEG, MNG and GIF may also be supported, depending on how the Qt library used to compile the editor is configured. You should not rely on support of these formats when creating menus. Can be omitted or left blank.

Caption of item is the text that will be shown as the caption.


Command is any piece of script that will be run when this item is clicked. For longer or more complex commands it is recommended to define them as functions in init.qs and only call them from the menu. There are two special cases:

  • if the string used as command is "quit", the editor will quit (close all windows)
  • if the string used as command is "closewindow", the editor will close the current window

Keyboard_shortcut specifies the keyboard shortcut that will invoke the action. If two different items have the same shortcut, the result is undefined. Can be omitted (if the icon is also omitted) or just left blank (no keyboard shortcut). The format is modifiers+key.

Format of one label. 

items/name_of_item=label Caption of label

A label with a short description text, usually placed before or after other items. A label is valid only in a toolbar; a menu cannot contain a label.

Special toolbar items

A toolbar can also contain some special items that provide more functionality than just a button. Such items always have a name starting with the underscore character, so it is advised not to use such names for your own items, to avoid a possible collision. You can insert them into any toolbar just by referencing their name in a list. Some of the items take a parameter; in that case, the name of the item should be followed by a space, followed by the parameter value. For example, _color_tool fg specifies the _color_tool item with parameter fg.

Table 8.1. List of special toolbar items

Name            Description
_revision_tool  Tool to list revisions and change them
_zoom_tool      Tool to change zoom level of currently shown page
_page_tool      Tool to show and change which page is currently shown
_color_tool     Tool to pick named color. Parameter specifies name of the color
_edit_tool      Tool to edit arbitrary text. Parameter specifies name of the text
_number_tool    Tool to edit arbitrary number. Parameter specifies name of the number. A script can specify a predefined set of numbers that will be available in a drop-down box. User is still able to type in a number not present in the list.
_select_tool    Tool to select text from list of strings. Parameter specifies name of the text. A script can specify the set of strings that will be available in a drop-down box.


The name must be unique for special toolbar items that have name as parameter. If you create two toolbar items with same name, the results are undefined.


The path where to look for icons is defined in the path section of the configuration file. The key icon specifies a semicolon separated list of paths, which will be searched from first to last until the icon is found. Path elements may contain variable references in shell style ($VARIABLE and ${VARIABLE}), which will be expanded automatically. The default icon path is $HOME/.pdfedit/icon;/usr/share/pdfedit/icon. The recommended way for a user to install his own icons is to copy them to the .pdfedit/icon subdirectory in his home directory.
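A possible way to process such a path value is sketched below. The function names expandVars and iconDirs are hypothetical, and variables are taken from a map instead of the process environment so that the example stays self-contained. How the editor treats unknown variables is not stated in this document; expanding them to an empty string is an assumption borrowed from shell behavior.

```cpp
#include <cctype>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::string> Env;

// Replaces $NAME and ${NAME} by values from 'env'.
// Assumption: unknown names expand to an empty string, as in a shell.
std::string expandVars(const std::string &s, const Env &env) {
    std::string out;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (s[i] != '$') { out += s[i]; continue; }
        std::string name;
        if (i + 1 < s.size() && s[i + 1] == '{') {
            std::string::size_type end = s.find('}', i + 2);
            if (end == std::string::npos) { out += '$'; continue; }  // unterminated
            name = s.substr(i + 2, end - (i + 2));
            i = end;
        } else {
            std::string::size_type j = i + 1;
            while (j < s.size() &&
                   (std::isalnum(static_cast<unsigned char>(s[j])) || s[j] == '_'))
                ++j;
            if (j == i + 1) { out += '$'; continue; }  // lone '$' stays literal
            name = s.substr(i + 1, j - (i + 1));
            i = j - 1;
        }
        Env::const_iterator it = env.find(name);
        if (it != env.end()) out += it->second;
    }
    return out;
}

// Splits the semicolon separated value and expands every non-empty element.
std::vector<std::string> iconDirs(const std::string &pathValue, const Env &env) {
    std::vector<std::string> dirs;
    std::string::size_type start = 0;
    while (start <= pathValue.size()) {
        std::string::size_type end = pathValue.find(';', start);
        if (end == std::string::npos) end = pathValue.size();
        if (end > start)
            dirs.push_back(expandVars(pathValue.substr(start, end - start), env));
        start = end + 1;
    }
    return dirs;
}
```

The returned directories keep the order of the configuration value, so searching them front to back gives the first to last lookup described above.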

Icon themes

PDF Editor supports icon themes. A theme can be created by creating a subdirectory in any directory from the icon path and putting some custom icons there. The theme can then be activated by modifying the theme/current key in the icon section of the configuration: set the key to the name of the directory with the theme. It can also simply be specified in the 'Icon theme' field in the editor's configuration dialog.

The editor first looks for the icon from the theme across all directories in the icon path, by appending the theme name to the path. For example, if the theme is hicolor and one icon directory is $HOME/.pdfedit/icon, the editor looks in $HOME/.pdfedit/icon/hicolor for that icon. If a themed icon is not found, the editor looks for a default icon in the icon directory itself (which also corresponds to the default theme).
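The lookup order can be written down as a list of candidate file names. This sketch reflects one reading of the paragraph above (all themed locations first, then all default ones); iconCandidates is a hypothetical helper and not part of the editor.

```cpp
#include <string>
#include <vector>

// Builds candidate file names in lookup order: the theme subdirectory of every
// icon directory first, then the icon directories themselves (default theme).
std::vector<std::string> iconCandidates(const std::vector<std::string> &iconDirs,
                                        const std::string &theme,
                                        const std::string &iconFile) {
    std::vector<std::string> out;
    if (!theme.empty())
        for (const std::string &dir : iconDirs)
            out.push_back(dir + "/" + theme + "/" + iconFile);
    for (const std::string &dir : iconDirs)
        out.push_back(dir + "/" + iconFile);
    return out;
}
```

A caller would test the candidates in order and load the first file that exists.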

Chapter 9. Settings system

PDF Editor uses QSettings and StaticSettings as backends to store its configuration. Internally, StaticSettings is used for static settings (which cannot be modified) and QSettings for dynamic settings. We can see the static settings as system settings and the dynamic settings as user settings. Settings use a key/value scheme, where the keys maintain a directory-like hierarchy, with components separated by a slash ("/"). This helps to manage related settings together. The only limitation is that the first and last character of a key should not be a slash and there should not be two consecutive slashes in the key, otherwise unpredictable behavior might occur.

Any value specified in user settings overrides corresponding value in system (static) settings, so basically only values not found in user settings are searched in system settings.
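The key rules and the override order can be summarized in a few lines. This is a sketch with plain maps standing in for QSettings and StaticSettings. SettingsMap, isValidKey and readSetting are hypothetical names, and rejecting an empty key is an assumption of this example.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, std::string> SettingsMap;

// A valid key does not start or end with '/' and contains no "//".
// Assumption: an empty key is rejected as well.
bool isValidKey(const std::string &key) {
    if (key.empty() || key[0] == '/' || key[key.size() - 1] == '/') return false;
    return key.find("//") == std::string::npos;
}

// Looks in user settings first, then in static settings, then uses the default.
std::string readSetting(const SettingsMap &userSettings,
                        const SettingsMap &staticSettings,
                        const std::string &key, const std::string &defValue) {
    SettingsMap::const_iterator it = userSettings.find(key);
    if (it != userSettings.end()) return it->second;
    it = staticSettings.find(key);
    if (it != staticSettings.end()) return it->second;
    return defValue;
}
```

Only the user map would ever be written to; the static map acts as a read-only fallback.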

There is also support for connecting a function to a slot that notifies about any setting change. This way the GUI can stay consistent with the stored settings, independently of how a setting is changed, whether in the options dialog or by some script.

Static (system) Settings

Static settings are stored in the file pdfeditrc. This file is looked for at application start, first in the data directory (which is defined in config.h in the GUI source directory) and, if not found there, in the directory where the editor executable is located. The StaticSettings class handles reading of this file.

File format for static settings

The format is very similar to the format of an "ini file", but has been extended to allow indentation and comments, so the file is more readable and manageable.

The file is automatically assumed to be in utf-8 encoding. Any leading or trailing whitespace on a line is ignored, as is any whitespace between the key and the equal sign, or between the equal sign and the value. Blank lines are ignored too. This allows the file to be indented to be more human-readable. Not counting comments and blank lines, the file may contain any number of lines with section identifiers and key-value pairs. A section identifier, in the format [section path], sets the prefix (i.e. the path in which the keys are stored) for all following keys, until a different section identifier is encountered. A key-value pair, in the format key=value, specifies one setting that maps the value to the specified key. For example, the value name = MyName in the section specified by its heading [settings/part1] is referenced in the editor by its entire key name settings/part1/name and its value is MyName.

Dynamic (user) Settings

Dynamic settings are stored in the file pdfeditrc in the subdirectory .pdfedit in the user's home directory. Note that most other settings (custom scripts, icons, etc.) are stored in that directory by default too. This file is read by the QSettings class on application start, and when the last window is closed the settings are written back (if they were modified). The file is also explicitly written when the user presses the Ok or Apply button in the Option dialog (to avoid losing settings if the program crashes or is killed) or if requested by a script, for example by calling settings.flush().

File format for user settings

The format is basically an "ini file". You cannot insert comments in the file or indent it, as this is not supported by the QSettings configuration file parser, and the settings will be overwritten anyway when the file is updated.

The file may contain any number of sections, each section having one or more key-value pairs. A section identifier, in the format [section path], sets the prefix (i.e. the path in which the keys are stored) for all keys in the section. The section named General has the special meaning of being the section with an "empty prefix" (root section). After the section identifier, the key-value pairs follow, one per line, in the format key=value. Each specifies one setting that maps the value to the specified key. Between the last key-value pair of one section and the next section there is one extra blank line. Putting extra blank lines in the middle of a section (or anywhere else) is not allowed, as it will break the file format.

Chapter 10. Object tree view

Structure of PDF can be represented as a tree, and as such, it is shown in the treeview. However, there are some problems:

  • Many elements (such as page) are in the tree two or more times, often with different representation (List of pages vs. complicated tree structure in Pages dictionary). When you modify one of them, the others usually change in some way too.

  • The tree contains references (analogous to symbolic links in unix filesystems) and these references can form cycles (a reference in tree A can point to tree B, while some other reference in tree B points to tree A). In fact, cycles are very common; for example, a page always has a link to the parent page dictionary in which it is contained.

  • Single item can be referenced multiple times from different parts of tree. Common example are fonts, as one font is usually referenced in Resources dictionary on multiple pages.

  • The tree is very large. Even the tree of a very small file with a single page contains over 1000 items, and huge documents (like the PDF specification, which has 1236 pages) will probably have over one million tree items, as the tree items are branched to a very detailed level, basically to the level of single words in most documents. This is a problem, partly because of the memory taken by so many items, and mainly because most users are unable to orient themselves effectively in such a large tree.

MultiTreeWindow class

Class providing a tree view of PDF objects. It supports multiple tabs, showing individual trees inside them.

Splitting the tree into multiple tabs partially solves the user disorientation problem, as all content streams are opened in tabs, so their operator trees do not clutter the "main" tree view showing pages, annotations, outlines, etc.

This window shows a list of tabs, with one "main" tab that contains the document as its root element and zero or more "secondary" tabs that show some elements from the main tree in more detail. The main tree cannot be closed and is fixed to showing the PDF document as its root item. Secondary trees can be closed at any time when the user thinks they are no longer needed.

A single tree in the multi tree window is managed by the TreeWindow class.

TreeWindow class

Class providing a tree view of PDF objects, having one object at the root and showing its children. It uses QListView for showing the tree, and all items inside the QListView are derived from the TreeItemAbstract class (which is derived from the ordinary QListViewItem class). The TreeWindow also imposes some limitations on the QListView in it. Most notably, you can only insert items that are derived from the TreeItemAbstract class, not ordinary QListViewItem items (if you bypass this limitation, you can expect strange behavior), and the listview must have at most one root item. This is required by the GUI logic that the tree corresponds to something, either the document or some part of it (or the tree is empty). It also simplifies some things.

TreeItemAbstract class

This class is inherited from the basic QListViewItem, providing some extra functionality (getting the QObject wrapper from the tree item with a getObject() method, managing children of the tree item and some base methods to support automatically reloading only the necessary tree items when a change is detected). All specific tree items are derived from this class. The methods needed to fill the tree with child items (if any) are pure virtual (abstract).

TreeItemAbstract subclasses

  • TreeItem - Base class for item wrapping boost::shared_ptr<IProperty>. Every low-level object is derived from IProperty. This class is abstract.

    • TreeItemArray - Class wrapping boost::shared_ptr<CArray>. Child items are array elements.

    • TreeItemCStream - Class wrapping boost::shared_ptr<CStream>. Child items are properties from the stream dictionary.

    • TreeItemDict - Class wrapping boost::shared_ptr<CDict>. Child items are dictionary properties.

      • TreeItemOutline - Class wrapping boost::shared_ptr<COutline>. COutline is basically a CDict with few extra methods.

    • TreeItemRef - Class wrapping boost::shared_ptr<CRef>. Child item is the reference target.

    • TreeItemSimple - Class wrapping shared pointer to one of the simple types (CInt, CReal, CBool, CName, CString). Simple types have no child items.

  • TreeItemAnnotation - Class for tree item wrapping boost::shared_ptr<CAnnotation>. CAnnotation is a high-level object representing an annotation in a document.

  • TreeItemContentStream - Class for tree item wrapping boost::shared_ptr<CContentStream>. Child items are top-level PDF operators from the content stream.

  • TreeItemOperatorContainer - Class for tree item wrapping arbitrary vector with PDF operators. This is used mainly for displaying selected PDF operators in the tree.

  • TreeItemPage - Class for tree item wrapping a page in the document (boost::shared_ptr<CPage>). Children of this tree item are the page dictionary, content streams in the page and annotations present in the page.

  • TreeItemPdf - Class for tree item wrapping the entire document. This class is a bit of an exception, as it can be used in multiple modes. One is for representing the entire document; in other modes it can be a list of document pages or a list of document outlines. When it represents the whole document, it contains the document dictionary, the list of outlines and the list of pages as children. When it represents a list, it contains all items of the given list type in the document (i.e. all outlines or pages).

  • TreeItemPdfOperator - Class for tree item wrapping a single PDF operator. Children can be suboperators (suboperators can nest arbitrarily deep) or operands. Theoretically both are possible, but all PDF operators contain either operators or operands, not both, so such a combination would mean the PDF file is probably corrupted. Operands are derived from IProperty, so tree items representing operands will be derived from the TreeItem class.


Adapter design pattern

The adapter design pattern (sometimes referred to as the wrapper pattern or simply a wrapper) 'adapts' one interface for a class into one that a client expects. An adapter allows classes to work together that normally could not because of incompatible interfaces by wrapping its own interface around that of an already existing class.

Object Adapter

Composite design pattern

Composite is an object designed as a composition of one-or-more similar objects (other kinds of shapes/geometries), all exhibiting similar functionality. This is known as a "has-a" relationship between objects. The key concept is that you can manipulate a single instance of the object just as you would a group of them.


Content stream

A content stream is a PDF stream object whose data consists of a sequence of instructions describing the graphical elements to be painted on a page. The instructions are represented in the form of PDF objects, using the same object syntax as in the rest of the PDF document. However, whereas the document as a whole is a static, random-access data structure, the objects in the content stream are intended to be interpreted and acted upon sequentially. Each page of a document is represented by one or more content streams.

Cross reference table

The cross-reference table contains information that permits random access to indirect objects within the file so that the entire file need not be read to locate any particular object. The table contains a one-line entry for each indirect object, specifying the location of that object within the body of the file. (Beginning with PDF 1.5, some or all of the cross-reference information may alternatively be contained in cross-reference streams; see Pdf specification Section 3.4.7, Cross-Reference Streams.) The cross-reference table is the only part of a PDF file with a fixed format, which permits entries in the table to be accessed randomly.

Decorator design pattern

The decorator pattern works by wrapping the new "decorator" object around the original object, which is typically achieved by passing the original object as a parameter to the constructor of the decorator, with the decorator implementing the new functionality. The interface of the original object needs to be maintained by the decorator.



ECMAScript

ECMAScript is a programming language, similar to Javascript (which is in fact an extension of ECMAScript). The language is standardized by Ecma International as Standard ECMA-262, also approved as ISO/IEC 16262. For more details, see the specification:

ECMAScript Language Specification, 3rd edition (December 1999)

Pdf dictionary data type

A dictionary object is an associative table containing pairs of objects, known as the dictionary's entries. The first element of each entry is the key and the second element is the value. The key must be a name (unlike dictionary keys in PostScript, which may be objects of any type). The value can be any kind of object, including another dictionary.

Pdf document catalog

The root of a document's object hierarchy is the catalog dictionary, located by means of the Root entry in the trailer of the PDF file (see Pdf specification Section 3.4.4, File Trailer). The catalog contains references to other objects defining the document's contents, outline, article threads (PDF 1.1), named destinations, and other attributes. In addition, it contains information about how the document should be displayed on the screen, such as whether its outline and thumbnail page images should be displayed automatically and whether some location other than the first page should be shown when the document is opened.

Factory method design pattern

The Factory Method pattern is an object-oriented design pattern. Like other creational patterns, it deals with the problem of creating objects (products) without specifying the exact class of object that will be created. Factory Method, one of the patterns from the Design Patterns book, handles this problem by defining a separate method for creating the objects, which subclasses can then override to specify the derived type of product that will be created.

Factory method

Facade design pattern

The facade pattern is an object-oriented design pattern. A facade is an object that provides a simplified interface to a larger body of code, such as a class library.



Functor

In the C++ context, a functor is an object of a class implementing the function call operator (operator()), so it can be called like a function.
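A short illustration follows. GreaterThan and countAbove are names made up for this example and do not come from the PDFedit sources.

```cpp
#include <algorithm>
#include <vector>

// A functor: instances can be called like a function and can carry state.
struct GreaterThan {
    int limit;
    explicit GreaterThan(int l) : limit(l) {}
    bool operator()(int value) const { return value > limit; }
};

// Counts the values above 'limit' by passing the functor to an STL algorithm.
long countAbove(const std::vector<int> &values, int limit) {
    return static_cast<long>(
        std::count_if(values.begin(), values.end(), GreaterThan(limit)));
}
```

Unlike a plain function, the functor keeps its own limit, so differently configured instances can be passed to the same algorithm.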

GPL - General public licence

General public licence created by the Free Software Foundation. The purpose of the GPL is to grant any user the right to copy, modify and redistribute programs and source code from developers that have chosen to license their work under the GPL. See GPL wiki for more information.

Incremental update

The contents of a PDF file can be updated incrementally without rewriting the entire file. Changes are appended to the end of the file, leaving its original contents intact. Each such incremental update adds a new cross reference section with a trailer, and so a new revision of the document.

Indirect pdf object

Any object in a PDF file may be labeled as an indirect object. This gives the object a unique object identifier by which other objects can refer to it (for example, as an element of an array or as the value of a dictionary entry). The object identifier consists of two parts:

  • A positive integer object number. Indirect objects are often numbered sequentially within a PDF file, but this is not required; object numbers may be assigned in any arbitrary order.
  • A non-negative integer generation number. In a newly created file, all indirect objects have generation numbers of 0. Nonzero generation numbers may be introduced when the file is later updated

Together, the combination of an object number and a generation number uniquely identifies an indirect object. The object retains the same object number and generation number throughout its existence, even if its value is modified.


Invariant

An invariant is a condition that is always true at a certain point in a program. In the context of automatic testing, an invariant defines the expected behavior, which is compared to the real result of an operation.

Iterator design pattern

The Iterator pattern defines an interface that declares methods for sequentially accessing the objects in a collection. A class that accesses a collection only through such an interface remains independent of the class that implements the interface.


Linearized pdf document

A Linearized PDF file is a file that has been organized in a special way to enable efficient incremental access in a network environment. The file is valid PDF in all respects, and is compatible with all existing viewers and other PDF applications. Enhanced viewer applications can recognize that a PDF file has been linearized and can take advantage of that organization (as well as added hint information) to enhance viewing performance.

Observer design pattern

The observer pattern (sometimes known as publish/subscribe) is a design pattern used in computer programming to observe the state of an object in a program. It is related to the principle of Implicit invocation.


Pdf page tree

Pdf stores all page dictionaries organized in a so-called page tree. This tree distinguishes 2 types of nodes:

  • Intermediate tree node - this node's purpose is to collect other nodes as a subtree. It is represented by a Pdf dictionary data type which contains an element with the name Kids, holding references to the Indirect pdf object of each child node (a child can be either an intermediate or a page node). The intermediate tree node dictionary contains an element with the name Type, which has to have the value Pages. To enable traversing the tree, a Count element is also required in the node's dictionary. This element holds the number of page tree nodes in the current intermediate node.
  • Page (leaf) tree node - this node contains direct page information. It is represented by page dictionary which has element with Type name and Page value.

Root of the tree is referenced by Pdf document catalog dictionary as Pages element reference.

With a well balanced tree, it is possible to access an arbitrary page (even with a very large number of pages) in very short time (just a few hops through intermediate nodes). Pdf creators usually collect pages in chunks in one intermediate node. Intermediate nodes are likewise collected by 10. This means that a document with 1000 pages has a page tree with 3 levels of intermediate nodes.
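The effect of balancing can be checked with a toy model. Node, buildTree and findPage are hypothetical names for this sketch. Count is modelled as the number of pages below a node, which is the quantity a page lookup needs, and the tree is built with a fixed fanout as an idealization of what pdf creators produce.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Toy page tree node: a leaf is a page, an intermediate node has kids.
struct Node {
    std::vector<std::shared_ptr<Node> > kids;  // empty for a page (leaf)
    std::size_t count;                         // pages below this node; 1 for a leaf
    Node() : count(1) {}
};

// Builds a balanced tree; requires pages >= 1 and fanout >= 2.
std::shared_ptr<Node> buildTree(std::size_t pages, std::size_t fanout) {
    std::vector<std::shared_ptr<Node> > level;
    for (std::size_t i = 0; i < pages; ++i) level.push_back(std::make_shared<Node>());
    do {  // the root is always an intermediate node, so group at least once
        std::vector<std::shared_ptr<Node> > parents;
        for (std::size_t i = 0; i < level.size(); i += fanout) {
            std::shared_ptr<Node> p = std::make_shared<Node>();
            p->count = 0;
            for (std::size_t j = i; j < i + fanout && j < level.size(); ++j) {
                p->kids.push_back(level[j]);
                p->count += level[j]->count;
            }
            parents.push_back(p);
        }
        level.swap(parents);
    } while (level.size() > 1);
    return level.front();
}

// Finds the page with zero-based 'index'; 'hops' is increased by the number of
// intermediate nodes visited. Returns nullptr when the index is out of range.
const Node *findPage(const Node *node, std::size_t index, std::size_t &hops) {
    if (index >= node->count) return nullptr;
    while (!node->kids.empty()) {
        ++hops;
        for (const std::shared_ptr<Node> &kid : node->kids) {
            if (index < kid->count) { node = kid.get(); break; }
            index -= kid->count;               // skip the subtree using its count
        }
    }
    return node;
}
```

For 1000 pages grouped by 10, the lookup of any page passes three intermediate nodes, instead of scanning up to 1000 entries of a flat list.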


The page tree root is always an intermediate node.

Page tree

pdf stream

A stream object, like a string object, is a sequence of bytes. However, a PDF application can read a stream incrementally, while a string must be read in its entirety. Furthermore, a stream can be of unlimited length, whereas a string is subject to an implementation limit. For this reason, objects with potentially large amounts of data, such as images and page descriptions, are represented as streams.

pdf trailer

The trailer of a PDF file enables an application reading the file to quickly find the cross-reference table and certain special objects. It is represented by a dictionary and it is stored at the end of the file. It contains entries for the document root (Pdf document catalog), the previous cross reference section and others (see Pdf specification section 3.4.4).

Wrapper design pattern

A wrapper converts the interface of a class into another interface clients expect. Wrappers let classes work together that couldn't otherwise because of incompatible interfaces.