Wednesday, May 11, 2011

Calligra is the Future of Free Software Office Suites

Edit and foreword: Some people told me offline that this blog can be read as aggressive towards LibreOffice. That is not my intention at all. Just look at it as a post highlighting the advantages and strong points of the Calligra Suite. So without further ado...

A couple of days ago Michael Meeks published a blog post called 'LibreOffice is the future of Free Software Office suites'. Michael is one of the lead developers of LibreOffice and also one of the founders of the Document Foundation, the organization behind LibreOffice. In that post he makes a number of points that lead to the conclusion in his title:
  • LibreOffice is vendor neutral
  • LibreOffice is robust to participants leaving
  • Linux distributions are safer with LibreOffice
  • LibreOffice has a different, and better QA model
  • Division is (sadly) sometimes necessary
  • The Document Foundation champions ODF
  • We are transparent about our contributors
Each of those points is a section in the text. If you haven't read the blog already, you should probably do that now before continuing your reading here. It's quite long but it's a good read.

However...

What is obvious when reading that text is that Michael only compares LibreOffice to one other free office suite: OpenOffice.org. He is probably on solid ground when he says that, compared to OpenOffice.org, LibreOffice is more future-proof.

But let's examine his arguments in relation to another free office suite: the Calligra Suite (short: Calligra). And more importantly, let's examine what he is not saying.

First of all, all of the advantages mentioned above are also true for Calligra. We don't have the Document Foundation behind us, but we do have one of the largest open source communities on Earth: the KDE community. Other than that, the similarities are striking.

But what more do we have? We have:
  • Flexibility
  • A Clean and Well Kept Code Base
  • Qt
Let's examine them one by one.

Flexibility

The Calligra suite is incredibly flexible. Most of the code lives in shared libraries or shared plugins. The applications themselves comprise only a small minority of the total code. Only code that is unique to a particular application is maintained for that application; if anything can be shared, it is immediately moved to a library or a plugin that can be used in all applications.

Calligra also has a strong separation between the engine and the user interfaces. This means that it's easy to create new user interfaces for new situations or environments. Today Calligra has two officially supported user interfaces on top of the engine: a standard desktop one and one for smartphones (Calligra Mobile). There is also a new, more general user interface being developed for tablets and other touch-based devices. In addition to that, several companies are at this moment developing their own user interfaces for their own special needs on top of the Calligra engine.
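To make the engine/UI split concrete, here is a minimal sketch in Python. It is not Calligra code: the class names `DocumentEngine`, `DesktopView` and `MobileView` are made up for illustration, and the real engine is of course C++ and far richer. The only point is that the engine knows nothing about any front end, so a new front end is just another small class.

```python
from typing import List


class DocumentEngine:
    """Hypothetical engine: owns the document and knows nothing about any UI."""

    def __init__(self, paragraphs: List[str]) -> None:
        self.paragraphs = list(paragraphs)

    def word_count(self) -> int:
        return sum(len(p.split()) for p in self.paragraphs)


class DesktopView:
    """Hypothetical desktop front end: shows everything plus a status line."""

    def render(self, engine: DocumentEngine) -> str:
        body = "\n\n".join(engine.paragraphs)
        return f"{body}\n[{engine.word_count()} words]"


class MobileView:
    """Hypothetical small-screen front end: truncates long paragraphs."""

    def __init__(self, max_chars: int = 20) -> None:
        self.max_chars = max_chars

    def render(self, engine: DocumentEngine) -> str:
        lines = []
        for p in engine.paragraphs:
            if len(p) > self.max_chars:
                # Leave room for the three-character ellipsis.
                p = p[: self.max_chars - 3] + "..."
            lines.append(p)
        return "\n".join(lines)


if __name__ == "__main__":
    engine = DocumentEngine(["Hello world", "A much longer paragraph of text here"])
    print(DesktopView().render(engine))
    print(MobileView(max_chars=10).render(engine))
```

Both views read from the same engine object; neither needs the other to exist.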

Calligra can be adapted to the needs of new user groups very easily. In OpenOffice.org there was a long-running project called OOo4Kids. This project aimed (aims?) at producing a simplified user interface for kids and took several years to develop. The same could have been done using our Flake technology in 2-3 weeks.

It is also easy to create a subset of the standard features for Calligra by disabling or simply not installing a number of the plugins. If you want to extend Calligra, it is similarly easy to create new plugins and add them to your installation.
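The idea of shipping a subset by leaving plugins out can be sketched as follows. This is an illustration only: `PluginRegistry` and the plugin names are hypothetical, and Calligra's actual plugin loading goes through the KDE libraries rather than anything like this.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class PluginRegistry:
    """Hypothetical registry mapping a plugin name to a factory callable."""

    factories: Dict[str, Callable[[], str]] = field(default_factory=dict)
    disabled: Set[str] = field(default_factory=set)

    def register(self, name: str, factory: Callable[[], str]) -> None:
        self.factories[name] = factory

    def disable(self, name: str) -> None:
        self.disabled.add(name)

    def load_enabled(self) -> Dict[str, str]:
        # Only plugins that are installed (registered) and not disabled are created.
        return {
            name: factory()
            for name, factory in sorted(self.factories.items())
            if name not in self.disabled
        }


if __name__ == "__main__":
    registry = PluginRegistry()
    # Plugin names below are invented for the example.
    registry.register("text", lambda: "text shape ready")
    registry.register("picture", lambda: "picture shape ready")
    registry.register("chart", lambda: "chart shape ready")
    registry.disable("chart")  # a simplified edition without charts
    print(registry.load_enabled())
```

Extending the suite is the mirror image: register one more factory and every application that uses the registry sees it.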


A Clean and Well Kept Code Base

Calligra has a clean and well kept code base. This means that code that can be shared is shared. There is no old garbage lying around in the corners. The directory structure is sound.

To mention some things that we don't have:
  • There are no comments in German that have to be translated.
  • There are no large chunks of disabled code that were left for future reference because no version control system was used for a long time.
  • We have one string class instead of five(!).
  • We have one bool variable type instead of four (of which one can take 3 values).
  • We have one textbox implementation for all of Calligra and not one for the text application and one for the presentation application.
In addition to what LibreOffice has, Calligra has one very advanced painting program (Krita), one project management application (Plan) and one note-taking application (Braindump). In the next release there will also be a new application for doing network diagrams (Flow).

All of this is done in 1.1 MLoC (million lines of code), as opposed to 5.5 MLoC for LibreOffice, according to the sloccount tool.

This all means that if you work on Calligra you will:
  • get up to speed with the code base more quickly
  • get more done with less work
  • have to step around fewer roadblocks
  • have more fun (ok, this one is subjective)
How much better productivity will this lead to? Let's see in the next section.

Qt

I would wager a significant sum that the main reason Calligra gets so much done with so little is the Qt toolkit, a free C++ toolkit under the LGPL. In my opinion, Qt is the best toolkit anywhere for C++, and most likely for any language. Not only is it very, very efficient to work with, it is also very comprehensive and wide-ranging.

And it was developed from the start to be cross-platform. Using Qt means that you immediately have a head start when it comes to portability. Qt runs on Linux, Windows, Mac OS X and many embedded platforms, with a native look and feel. Calligra runs on most Unix variants today. It can be built and run on Windows; it is not yet packaged but will be soon.

If Calligra used only Qt, it would already run on Mac OS X as well, but we also use parts of the KDE libraries, some of which are not yet ported to the Macintosh platform.

Here are some of the features of the Qt toolkit that we use in Calligra and which applications using other toolkits (or none at all) have to implement themselves:
  • A graphics toolkit with native look and feel on different platforms.
  • Platform abstraction.
  • An advanced UI toolkit with many different layout options that adapt to different dialog sizes.
  • A dialog builder that can autogenerate code (Qt Designer)
  • Advanced graphics primitives and paint model.
  • XML parsing and generation (although we have some code of our own here)
We also use some features from the KDE libraries:
  • An advanced system for internationalization and localization
  • Plugin finding and loading
  • Object embedding, allowing embedding of Calligra components in other applications (KParts)
So how much does this buy us? Let us look at an example.

LibreOffice has a Google Summer of Code project called 'Implementing multi-line edit bar in calc'. The abstract says:


Libre office is an opensource multi-platform productivity suite. Calc is used to perform calculations, analyse information and manage lists in spreadsheets. But currently it is very tedious to edit the cell contents when the contents grows in length. The contents keep spreading horizontally along the single line text input bar provided. The aim of this project is to implement a multi-line re-sizable input bar with a scroll bar and word wrap feature, which when its contents grow will shift the overflowing string to the next line. This will provide easier editing of the cell contents and will provide a more modular user interface.

This sounds like a worthwhile project. Showing large formulas on a single line is bad usability. But at what cost?

A Google Summer of Code project is supposed to take the student a little over two months to code, including time to get familiar with the code base and tools. Compare this with doing the same project using Qt: as far as I understand, it would be done in 10-20 lines of code. Let's allow for some advanced features and call it 1000 lines. That is still a significant difference in the effort needed, almost a factor of 10. The same difference can be seen in the OOo4Kids project mentioned above.


Conclusion

OpenOffice.org and, to a lesser degree, LibreOffice have many, many more users than Calligra at this point. However, those users are all on the desktop, using full-sized screens. Because of the big memory and CPU footprint and the heavy intertwining of the UI with the engine, these suites will not be able to adapt to the mobile arena, with touch input and much smaller screens, for many years.

Calligra, on the other hand, has already adapted to those environments. There is already a community UI called Calligra Mobile, which is well integrated into Maemo 5 on the Nokia N900. This interface is easy to adapt to other operating systems like MeeGo. At the same time, several other UIs are being developed out of the public eye, to be released in a few months.

The Calligra engine is of high quality today, with most of the ODF feature set covered. The desktop user interface is lagging behind, but the community has just started improving it with the help of user interface designers. To help us, we have a clean, efficient code base and the best toolkit on Earth. The community is growing, with new developers joining every week.

This is a race between Achilles and the Tortoise. The Tortoise has a long head start, but there is no way he will be able to keep up with the agility and speed of Achilles. So in the short, maybe medium, term LibreOffice is the future of free software office suites. On the desktop.

In the long run, and already now on non-desktop environments, Calligra is the future of free software office suites.

Sunday, May 01, 2011

StarView Metafiles in Calligra

This blog post is a report on a new feature in Calligra that improves interoperability with OpenOffice.org and LibreOffice.

Background

The OpenDocument Format contains a great idea: so-called Replacement Images. The idea is that when you have an embedded object in an ODF file, not every application may be able to handle that particular kind of embedded object.

So in addition to the object itself, a saving application has the option to save an extra image of the object alongside it. That way, the loading application can at least show the contents of the object to the user, even if he or she cannot edit the object. The problem lies in the fact that ODF never defines which formats this image can be in. In fact, ODF doesn't even distinguish between bitmap images like PNG or JPG and vector images like SVG (Scalable Vector Graphics) or WMF (Windows MetaFile).
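The fallback logic a loading application follows can be sketched like this. It is a simplified illustration, not Calligra's loader: the `EmbeddedObject` type, the function name and the MIME type strings are all made up for the example.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class EmbeddedObject:
    """Hypothetical stand-in for an object embedded in an ODF file."""

    mime_type: str
    data: bytes
    replacement_image: Optional[bytes] = None  # saved alongside the object, if at all


def choose_rendering(obj: EmbeddedObject, supported_types: Set[str]) -> str:
    """Decide how to show an embedded object to the user."""
    if obj.mime_type in supported_types:
        return "native"        # we can load and edit the object itself
    if obj.replacement_image is not None:
        return "replacement"   # show the picture; the object stays read-only
    return "placeholder"       # nothing to show but an empty frame


if __name__ == "__main__":
    supported = {"application/x-example-chart"}  # invented type name
    formula = EmbeddedObject("application/x-example-formula", b"...", b"image bytes")
    print(choose_rendering(formula, supported))
```

The second branch is only useful if the loader can actually decode the replacement image, which is exactly where the undefined image format becomes a problem.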

One of the bigger annoyances when interacting with OpenOffice.org or LibreOffice is that they save their replacement images as so-called StarView Metafiles. This format is age-old and dates back to the Star Office times, when Star Division was still a living company and OpenOffice.org didn't even exist. The format itself is not very complex; at least it's much simpler than, say, WMF or, worse, EMF. The problem is that it's completely undocumented.

I have been trying for a long time to find out anything about the format, but no matter who I asked at conferences, in email or in IRC channels, the answer was always UTSL (Use The Source, Luke).

Progress

I have been hesitant to use the source, since the OOo code has a reputation for being very big and very messy. However, just before Easter, Pinaraf, a.k.a. Pierre Ducroquet, started a project to do just that. He did some great detective work and produced a proof-of-concept parser and viewer for parts of the SVM format.

At Easter I got inspired and, using his work and my experience from similar work on WMF and EMF in Calligra, I started on a real production parser and viewer for SVM. And behold, in 3 days I managed to put together a framework for parsing the SVM records and showing replacement pictures in Calligra. Here is a picture that shows the current state of things:



As you can see we can already do polygons, polylines, text and colors. It also does filling and mapping of the image coordinate system to the document coordinate system.
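The coordinate mapping mentioned above is, at its core, a scale and a translation. Here is a minimal sketch assuming both coordinate systems are plain axis-aligned rectangles; the `Rect` type and `map_point` function are invented for the example, and the real code has to deal with the metafile's own map modes as well.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


def map_point(px: float, py: float, source: Rect, target: Rect) -> Tuple[float, float]:
    """Map a point from the image's coordinate system into the document's."""
    if source.width == 0 or source.height == 0:
        raise ValueError("source rectangle must have a non-zero size")
    # Offset within the source, rescaled to the target size, then shifted.
    tx = target.x + (px - source.x) * target.width / source.width
    ty = target.y + (py - source.y) * target.height / source.height
    return (tx, ty)


if __name__ == "__main__":
    image_bounds = Rect(0, 0, 1000, 500)   # units used inside the metafile
    frame_on_page = Rect(10, 20, 100, 50)  # where the picture sits in the document
    print(map_point(500, 250, image_bounds, frame_on_page))
```

Every polygon and polyline point goes through the same transform, so the drawing scales with the frame it is shown in.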

Future work

The main thing still missing is handling of bitmaps. SVM, like other meta formats, can embed bitmap images and also other types. SVM defines a type of embedded DIB (Device Independent Bitmap) that we will have to parse and make into a QImage.

Also, OOo 3.4, which was just released, has support for SVG images, a feature which they brag about in their release notes. What they don't mention is that they don't store pure SVG files as normal pictures or replacement images. Instead they embed the SVG in a very thin layer of SVM. I have no idea why anybody thought this was a good idea, but there it is. For compatibility reasons we in Calligra will have to handle this.

There is also only basic support for many formatting options. We do handle line color, text color and fill color. It's not obvious from the picture above, but there is currently no support for line styles. There is also very little support for text formatting, even basics such as bold and italic. But I'm very happy with the results so far.

SVM Specification
To fix the fact that there is no specification of the StarView Metafile format, we are creating one as we go along. You will find the specification for the StarView Metafile SVM format here. (Note how I carefully worded the last sentence so that it will be findable by Google for anybody wanting a spec. :-) ) At least I hope this is the correct link; I'm not very familiar with gitweb yet.

We will continue to enhance this spec, but it is difficult work because we have to read the uncommented source code of LibreOffice and try to deduce the intentions of the programmers. Help would be very welcome here, also from non-Calligra developers.
